{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 05_Train_RESCAL\n",
    "#\n",
    "# created by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 27, 2023\n",
    "# updated by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 27, 2023\n",
    "#\n",
    "# This script shows how to train a model (RESCAL) on DRKG and use grid search to find the best hyperparameters.\n",
    "#\n",
    "# Required packages:\n",
    "#          torch\n",
    "#          dgl, version: 0.4.3\n",
    "#          dglke\n",
    "#          numpy\n",
    "#\n",
    "# Required files:\n",
    "#          ./dataset\n",
    "#\n",
    "# Source tutorial: https://github.com/gnn4dr/DRKG/blob/master/embedding_analysis/Train_embeddings.ipynb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training DRKG Using RESCAL\n",
    "\n",
    "This notebook shows how to train a model (RESCAL) on DRKG and use grid search to find the best hyperparameters."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import the required libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
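    "## Grid search over hyperparameters\n",
    "\n",
    "We can train the RESCAL model with the DGL-KE command line. For more on how to use DGL-KE, see https://github.com/awslabs/dgl-ke.\n",
    "\n",
    "Here we train the model on two GPUs.\n",
    "\n",
    "Estimated training time: about 100000 steps * 1.35 s/step / 3600 s/h = 37.5 h"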
   ]
  },
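  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The time estimate above can be reproduced from the training log. The following is a minimal sketch, not part of the original tutorial: `log_lines`, `seconds_per_step` and `estimated_hours` are hypothetical names introduced here, and the four sample lines are copied from the training log in this notebook.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "# A few step-timing lines copied from the training log in this notebook.\n",
    "log_lines = [\n",
    "    '[proc 0][Train] 1 steps take 1.346 seconds',\n",
    "    '[proc 1][Train] 1 steps take 1.339 seconds',\n",
    "    '[proc 0][Train] 1 steps take 1.296 seconds',\n",
    "    '[proc 1][Train] 1 steps take 1.421 seconds',\n",
    "]\n",
    "\n",
    "# Captures the step count and the elapsed seconds of one timing line.\n",
    "pattern = re.compile('([0-9]+) steps take ([0-9.]+) seconds')\n",
    "\n",
    "\n",
    "def seconds_per_step(lines):\n",
    "    # Sum steps and seconds over all matching lines, then divide.\n",
    "    steps, seconds = 0, 0.0\n",
    "    for line in lines:\n",
    "        match = pattern.search(line)\n",
    "        if match:\n",
    "            steps += int(match.group(1))\n",
    "            seconds += float(match.group(2))\n",
    "    return seconds / steps\n",
    "\n",
    "\n",
    "def estimated_hours(max_step, sec_per_step):\n",
    "    # Total wall-clock hours for max_step steps at a constant step time.\n",
    "    return max_step * sec_per_step / 3600\n",
    "\n",
    "\n",
    "print(seconds_per_step(log_lines))\n",
    "print(estimated_hours(100000, 1.35))\n",
    "```\n",
    "\n",
    "The first few steps of a run are slower because of warm-up, so averaging over later steps likely gives a more representative step time."
   ]
  },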
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
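  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The candidate values listed above span a small grid. As a minimal sketch (the names `fixed`, `grid` and `configs` are hypothetical, and running every combination sequentially at about 37.5 h each is an assumption, not something this notebook states), the configurations can be enumerated like this:\n",
    "\n",
    "```python\n",
    "import itertools\n",
    "\n",
    "# Fixed settings and candidate values copied from the list above.\n",
    "fixed = {'batch_size': 4096, 'neg_sample_size': 256}\n",
    "grid = {\n",
    "    'hidden_dim': [200, 400],\n",
    "    'gamma': [6, 12, 18],\n",
    "    'lr': [0.01, 0.05, 0.1],\n",
    "}\n",
    "\n",
    "# Cartesian product of the candidate values, one dict per configuration.\n",
    "configs = [\n",
    "    {**fixed, **dict(zip(grid, values))}\n",
    "    for values in itertools.product(*grid.values())\n",
    "]\n",
    "\n",
    "# Assumption: each run takes about 37.5 h and runs are executed one by one.\n",
    "hours_per_run = 37.5\n",
    "total_hours = len(configs) * hours_per_run\n",
    "\n",
    "print(len(configs), 'configurations')\n",
    "print(configs[0])\n",
    "print(total_hours, 'hours for the full grid')\n",
    "```\n",
    "\n",
    "Under these assumptions, running the full grid would take several weeks of wall-clock time, which may be why only one configuration is trained at a time."
   ]
  },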
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading train triples....\n",
      "Finished. Read 5286834 train triples.\n",
      "Reading valid triples....\n",
      "Finished. Read 293713 valid triples.\n",
      "Reading test triples....\n",
      "Finished. Read 293714 test triples.\n",
      "|Train|: 5286834\n",
      "random partition 5286834 edges into 2 parts\n",
      "part 0 has 2643417 edges\n",
      "part 1 has 2643417 edges\n",
      "/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dgl/base.py:25: UserWarning: multigraph will be deprecated.DGL will treat all graphs as multigraph in the future.\n",
      "  warnings.warn(msg, warn_type)\n",
      "|valid|: 293713\n",
      "|test|: 293714\n",
      "Total initialize time 16.469 seconds\n",
      "[proc 1][Train](1/100000) average pos_loss: 0.6931247711181641\n",
      "[proc 1][Train](1/100000) average neg_loss: 0.6931509971618652\n",
      "[proc 1][Train](1/100000) average loss: 0.6931378841400146\n",
      "[proc 1][Train](1/100000) average regularization: 0.0002658705343492329\n",
      "[proc 1][Train] 1 steps take 3.759 seconds\n",
      "[proc 1]sample: 0.217, forward: 2.488, backward: 0.033, update: 1.020\n",
      "[proc 0][Train](1/100000) average pos_loss: 0.6931571960449219\n",
      "[proc 0][Train](1/100000) average neg_loss: 0.6931506991386414\n",
      "[proc 0][Train](1/100000) average loss: 0.693153977394104\n",
      "[proc 0][Train](1/100000) average regularization: 0.000265888636931777\n",
      "[proc 0][Train] 1 steps take 3.930 seconds\n",
      "[proc 0]sample: 0.238, forward: 2.500, backward: 0.028, update: 1.164\n",
      "[proc 1][Train](2/100000) average pos_loss: 0.6931489706039429\n",
      "[proc 1][Train](2/100000) average neg_loss: 0.6931504607200623\n",
      "[proc 1][Train](2/100000) average loss: 0.6931496858596802\n",
      "[proc 1][Train](2/100000) average regularization: 0.00034831042285077274\n",
      "[proc 1][Train] 1 steps take 2.338 seconds\n",
      "[proc 1]sample: 0.237, forward: 0.251, backward: 0.005, update: 1.845\n",
      "[proc 0][Train](2/100000) average pos_loss: 0.6931456327438354\n",
      "[proc 0][Train](2/100000) average neg_loss: 0.6931527853012085\n",
      "[proc 0][Train](2/100000) average loss: 0.693149209022522\n",
      "[proc 0][Train](2/100000) average regularization: 0.00034804409369826317\n",
      "[proc 0][Train] 1 steps take 2.347 seconds\n",
      "[proc 0]sample: 0.213, forward: 0.219, backward: 0.008, update: 1.907\n",
      "[proc 0][Train](3/100000) average pos_loss: 0.69313645362854\n",
      "[proc 0][Train](3/100000) average neg_loss: 0.693150520324707\n",
      "[proc 0][Train](3/100000) average loss: 0.6931434869766235\n",
      "[proc 0][Train](3/100000) average regularization: 0.00034474628046154976\n",
      "[proc 0][Train] 1 steps take 1.336 seconds\n",
      "[proc 0]sample: 0.003, forward: 0.211, backward: 0.003, update: 1.119\n",
      "[proc 1][Train](3/100000) average pos_loss: 0.6931609511375427\n",
      "[proc 1][Train](3/100000) average neg_loss: 0.693151593208313\n",
      "[proc 1][Train](3/100000) average loss: 0.6931562423706055\n",
      "[proc 1][Train](3/100000) average regularization: 0.00034766201861202717\n",
      "[proc 1][Train] 1 steps take 1.534 seconds\n",
      "[proc 1]sample: 0.006, forward: 0.328, backward: 0.002, update: 1.198\n",
      "[proc 0][Train](4/100000) average pos_loss: 0.6931375861167908\n",
      "[proc 0][Train](4/100000) average neg_loss: 0.6931552886962891\n",
      "[proc 0][Train](4/100000) average loss: 0.6931464672088623\n",
      "[proc 0][Train](4/100000) average regularization: 0.0003537691372912377\n",
      "[proc 0][Train] 1 steps take 1.504 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.275, backward: 0.002, update: 1.224\n",
      "[proc 1][Train](4/100000) average pos_loss: 0.6931264996528625\n",
      "[proc 1][Train](4/100000) average neg_loss: 0.6931558847427368\n",
      "[proc 1][Train](4/100000) average loss: 0.6931412220001221\n",
      "[proc 1][Train](4/100000) average regularization: 0.0003516373981256038\n",
      "[proc 1][Train] 1 steps take 2.420 seconds\n",
      "[proc 1]sample: 0.004, forward: 0.245, backward: 0.003, update: 2.167\n",
      "[proc 0][Train](5/100000) average pos_loss: 0.693074107170105\n",
      "[proc 0][Train](5/100000) average neg_loss: 0.6931430697441101\n",
      "[proc 0][Train](5/100000) average loss: 0.6931085586547852\n",
      "[proc 0][Train](5/100000) average regularization: 0.00035293272230774164\n",
      "[proc 0][Train] 1 steps take 1.415 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.197\n",
      "[proc 1][Train](5/100000) average pos_loss: 0.693065881729126\n",
      "[proc 1][Train](5/100000) average neg_loss: 0.6931426525115967\n",
      "[proc 1][Train](5/100000) average loss: 0.6931042671203613\n",
      "[proc 1][Train](5/100000) average regularization: 0.0003506076172925532\n",
      "[proc 1][Train] 1 steps take 1.350 seconds\n",
      "[proc 1]sample: 0.005, forward: 0.228, backward: 0.004, update: 1.113\n",
      "[proc 0][Train](6/100000) average pos_loss: 0.6930874586105347\n",
      "[proc 0][Train](6/100000) average neg_loss: 0.693145751953125\n",
      "[proc 0][Train](6/100000) average loss: 0.6931166052818298\n",
      "[proc 0][Train](6/100000) average regularization: 0.00035275903064757586\n",
      "[proc 0][Train] 1 steps take 1.346 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.207, backward: 0.002, update: 1.135\n",
      "[proc 1][Train](6/100000) average pos_loss: 0.6929855942726135\n",
      "[proc 1][Train](6/100000) average neg_loss: 0.6931449770927429\n",
      "[proc 1][Train](6/100000) average loss: 0.6930652856826782\n",
      "[proc 1][Train](6/100000) average regularization: 0.0003512033144943416\n",
      "[proc 1][Train] 1 steps take 1.339 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.223, backward: 0.003, update: 1.111\n",
      "[proc 0][Train](7/100000) average pos_loss: 0.6930510401725769\n",
      "[proc 0][Train](7/100000) average neg_loss: 0.6931166648864746\n",
      "[proc 0][Train](7/100000) average loss: 0.6930838823318481\n",
      "[proc 0][Train](7/100000) average regularization: 0.00035308609949424863\n",
      "[proc 0][Train] 1 steps take 1.296 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.220, backward: 0.002, update: 1.073\n",
      "[proc 1][Train](7/100000) average pos_loss: 0.6928874254226685\n",
      "[proc 1][Train](7/100000) average neg_loss: 0.6931042671203613\n",
      "[proc 1][Train](7/100000) average loss: 0.6929958462715149\n",
      "[proc 1][Train](7/100000) average regularization: 0.0003521911276038736\n",
      "[proc 1][Train] 1 steps take 1.421 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.236, backward: 0.003, update: 1.180\n",
      "[proc 0][Train](8/100000) average pos_loss: 0.692756175994873\n",
      "[proc 0][Train](8/100000) average neg_loss: 0.6930878758430481\n",
      "[proc 0][Train](8/100000) average loss: 0.6929219961166382\n",
      "[proc 0][Train](8/100000) average regularization: 0.0003486879577394575\n",
      "[proc 0][Train] 1 steps take 1.290 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.198, backward: 0.002, update: 1.089\n",
      "[proc 1][Train](8/100000) average pos_loss: 0.6926063299179077\n",
      "[proc 1][Train](8/100000) average neg_loss: 0.693039059638977\n",
      "[proc 1][Train](8/100000) average loss: 0.6928226947784424\n",
      "[proc 1][Train](8/100000) average regularization: 0.00035406905226409435\n",
      "[proc 1][Train] 1 steps take 1.440 seconds\n",
      "[proc 1]sample: 0.004, forward: 0.222, backward: 0.004, update: 1.210\n",
      "[proc 0][Train](9/100000) average pos_loss: 0.6923798322677612\n",
      "[proc 0][Train](9/100000) average neg_loss: 0.6929154992103577\n",
      "[proc 0][Train](9/100000) average loss: 0.6926476955413818\n",
      "[proc 0][Train](9/100000) average regularization: 0.0003520946193020791\n",
      "[proc 0][Train] 1 steps take 1.427 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.195, backward: 0.002, update: 1.229\n",
      "[proc 0][Train](10/100000) average pos_loss: 0.6897681355476379\n",
      "[proc 0][Train](10/100000) average neg_loss: 0.6925088167190552\n",
      "[proc 0][Train](10/100000) average loss: 0.691138505935669\n",
      "[proc 0][Train](10/100000) average regularization: 0.00036796368658542633\n",
      "[proc 0][Train] 1 steps take 1.401 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.218, backward: 0.003, update: 1.178\n",
      "[proc 1][Train](9/100000) average pos_loss: 0.6912362575531006\n",
      "[proc 1][Train](9/100000) average neg_loss: 0.6926723122596741\n",
      "[proc 1][Train](9/100000) average loss: 0.6919542551040649\n",
      "[proc 1][Train](9/100000) average regularization: 0.0003662715607788414\n",
      "[proc 1][Train] 1 steps take 1.692 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.216, backward: 0.004, update: 1.470\n",
      "[proc 0][Train](11/100000) average pos_loss: 0.6830613017082214\n",
      "[proc 0][Train](11/100000) average neg_loss: 0.6913416981697083\n",
      "[proc 0][Train](11/100000) average loss: 0.6872014999389648\n",
      "[proc 0][Train](11/100000) average regularization: 0.00047083236859180033\n",
      "[proc 0][Train] 1 steps take 1.398 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.263, backward: 0.003, update: 1.130\n",
      "[proc 1][Train](10/100000) average pos_loss: 0.6826210618019104\n",
      "[proc 1][Train](10/100000) average neg_loss: 0.6915360689163208\n",
      "[proc 1][Train](10/100000) average loss: 0.687078595161438\n",
      "[proc 1][Train](10/100000) average regularization: 0.0004773589316755533\n",
      "[proc 1][Train] 1 steps take 1.466 seconds\n",
      "[proc 1]sample: 0.004, forward: 0.231, backward: 0.002, update: 1.229\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](12/100000) average pos_loss: 0.6508206725120544\n",
      "[proc 0][Train](12/100000) average neg_loss: 0.6905622482299805\n",
      "[proc 0][Train](12/100000) average loss: 0.6706914901733398\n",
      "[proc 0][Train](12/100000) average regularization: 0.0010013666469603777\n",
      "[proc 0][Train] 1 steps take 1.358 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.238, backward: 0.002, update: 1.116\n",
      "[proc 1][Train](11/100000) average pos_loss: 0.6513535380363464\n",
      "[proc 1][Train](11/100000) average neg_loss: 0.6900795102119446\n",
      "[proc 1][Train](11/100000) average loss: 0.6707165241241455\n",
      "[proc 1][Train](11/100000) average regularization: 0.0010082288645207882\n",
      "[proc 1][Train] 1 steps take 1.419 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.240, backward: 0.003, update: 1.175\n",
      "[proc 0][Train](13/100000) average pos_loss: 0.5983461141586304\n",
      "[proc 0][Train](13/100000) average neg_loss: 0.7247665524482727\n",
      "[proc 0][Train](13/100000) average loss: 0.6615563631057739\n",
      "[proc 0][Train](13/100000) average regularization: 0.003072525141760707\n",
      "[proc 0][Train] 1 steps take 1.369 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.250, backward: 0.002, update: 1.116\n",
      "[proc 1][Train](12/100000) average pos_loss: 0.5901622176170349\n",
      "[proc 1][Train](12/100000) average neg_loss: 0.7391148805618286\n",
      "[proc 1][Train](12/100000) average loss: 0.6646385192871094\n",
      "[proc 1][Train](12/100000) average regularization: 0.0034198507200926542\n",
      "[proc 1][Train] 1 steps take 1.415 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.217, backward: 0.003, update: 1.193\n",
      "[proc 0][Train](14/100000) average pos_loss: 0.582738995552063\n",
      "[proc 0][Train](14/100000) average neg_loss: 0.739935040473938\n",
      "[proc 0][Train](14/100000) average loss: 0.6613370180130005\n",
      "[proc 0][Train](14/100000) average regularization: 0.002228089841082692\n",
      "[proc 0][Train] 1 steps take 1.394 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.235, backward: 0.004, update: 1.155\n",
      "[proc 1][Train](13/100000) average pos_loss: 0.589806318283081\n",
      "[proc 1][Train](13/100000) average neg_loss: 0.7262145280838013\n",
      "[proc 1][Train](13/100000) average loss: 0.6580104231834412\n",
      "[proc 1][Train](13/100000) average regularization: 0.0019839603919535875\n",
      "[proc 1][Train] 1 steps take 1.398 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.179\n",
      "[proc 0][Train](15/100000) average pos_loss: 0.5803512930870056\n",
      "[proc 0][Train](15/100000) average neg_loss: 0.711631715297699\n",
      "[proc 0][Train](15/100000) average loss: 0.6459915041923523\n",
      "[proc 0][Train](15/100000) average regularization: 0.0015364495338872075\n",
      "[proc 0][Train] 1 steps take 1.338 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.220, backward: 0.018, update: 1.099\n",
      "[proc 1][Train](14/100000) average pos_loss: 0.5891938209533691\n",
      "[proc 1][Train](14/100000) average neg_loss: 0.7081109285354614\n",
      "[proc 1][Train](14/100000) average loss: 0.6486523747444153\n",
      "[proc 1][Train](14/100000) average regularization: 0.00136648491024971\n",
      "[proc 1][Train] 1 steps take 1.354 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.219, backward: 0.003, update: 1.131\n",
      "[proc 0][Train](16/100000) average pos_loss: 0.558037281036377\n",
      "[proc 0][Train](16/100000) average neg_loss: 0.7699146270751953\n",
      "[proc 0][Train](16/100000) average loss: 0.6639759540557861\n",
      "[proc 0][Train](16/100000) average regularization: 0.002176607958972454\n",
      "[proc 0][Train] 1 steps take 1.325 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.222, backward: 0.017, update: 1.084\n",
      "[proc 1][Train](15/100000) average pos_loss: 0.5559300184249878\n",
      "[proc 1][Train](15/100000) average neg_loss: 0.7540372610092163\n",
      "[proc 1][Train](15/100000) average loss: 0.654983639717102\n",
      "[proc 1][Train](15/100000) average regularization: 0.0023829902056604624\n",
      "[proc 1][Train] 1 steps take 1.358 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.217, backward: 0.003, update: 1.137\n",
      "[proc 0][Train](17/100000) average pos_loss: 0.577553927898407\n",
      "[proc 0][Train](17/100000) average neg_loss: 0.6807335615158081\n",
      "[proc 0][Train](17/100000) average loss: 0.6291437149047852\n",
      "[proc 0][Train](17/100000) average regularization: 0.0013134344480931759\n",
      "[proc 0][Train] 1 steps take 1.331 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.222, backward: 0.002, update: 1.092\n",
      "[proc 1][Train](16/100000) average pos_loss: 0.5918696522712708\n",
      "[proc 1][Train](16/100000) average neg_loss: 0.6833780407905579\n",
      "[proc 1][Train](16/100000) average loss: 0.6376238465309143\n",
      "[proc 1][Train](16/100000) average regularization: 0.0012979820603504777\n",
      "[proc 1][Train] 1 steps take 1.439 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.227, backward: 0.002, update: 1.209\n",
      "[proc 0][Train](18/100000) average pos_loss: 0.5200765132904053\n",
      "[proc 0][Train](18/100000) average neg_loss: 0.877745509147644\n",
      "[proc 0][Train](18/100000) average loss: 0.6989110112190247\n",
      "[proc 0][Train](18/100000) average regularization: 0.004179983399808407\n",
      "[proc 0][Train] 1 steps take 1.374 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.217, backward: 0.003, update: 1.138\n",
      "[proc 1][Train](17/100000) average pos_loss: 0.5099474191665649\n",
      "[proc 1][Train](17/100000) average neg_loss: 0.9801042079925537\n",
      "[proc 1][Train](17/100000) average loss: 0.7450258135795593\n",
      "[proc 1][Train](17/100000) average regularization: 0.006766993086785078\n",
      "[proc 1][Train] 1 steps take 1.431 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.248, backward: 0.002, update: 1.162\n",
      "[proc 0][Train](19/100000) average pos_loss: 0.5626301765441895\n",
      "[proc 0][Train](19/100000) average neg_loss: 0.6745151877403259\n",
      "[proc 0][Train](19/100000) average loss: 0.6185727119445801\n",
      "[proc 0][Train](19/100000) average regularization: 0.0020254391711205244\n",
      "[proc 0][Train] 1 steps take 1.260 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.040\n",
      "[proc 1][Train](18/100000) average pos_loss: 0.69443678855896\n",
      "[proc 1][Train](18/100000) average neg_loss: 0.7034400701522827\n",
      "[proc 1][Train](18/100000) average loss: 0.6989384293556213\n",
      "[proc 1][Train](18/100000) average regularization: 0.0024382020346820354\n",
      "[proc 1][Train] 1 steps take 1.340 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.212, backward: 0.003, update: 1.106\n",
      "[proc 0][Train](20/100000) average pos_loss: 0.5514714121818542\n",
      "[proc 0][Train](20/100000) average neg_loss: 0.6786143183708191\n",
      "[proc 0][Train](20/100000) average loss: 0.6150428652763367\n",
      "[proc 0][Train](20/100000) average regularization: 0.002617734717205167\n",
      "[proc 0][Train] 1 steps take 1.427 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.211\n",
      "[proc 1][Train](19/100000) average pos_loss: 0.4788726568222046\n",
      "[proc 1][Train](19/100000) average neg_loss: 0.9261736869812012\n",
      "[proc 1][Train](19/100000) average loss: 0.7025231719017029\n",
      "[proc 1][Train](19/100000) average regularization: 0.008434128016233444\n",
      "[proc 1][Train] 1 steps take 1.321 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.104\n",
      "[proc 0][Train](21/100000) average pos_loss: 0.4588044285774231\n",
      "[proc 0][Train](21/100000) average neg_loss: 0.8965135812759399\n",
      "[proc 0][Train](21/100000) average loss: 0.6776590347290039\n",
      "[proc 0][Train](21/100000) average regularization: 0.011730297468602657\n",
      "[proc 0][Train] 1 steps take 1.374 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.223, backward: 0.002, update: 1.147\n",
      "[proc 1][Train](20/100000) average pos_loss: 0.4674466848373413\n",
      "[proc 1][Train](20/100000) average neg_loss: 0.8469936847686768\n",
      "[proc 1][Train](20/100000) average loss: 0.657220184803009\n",
      "[proc 1][Train](20/100000) average regularization: 0.007184714078903198\n",
      "[proc 1][Train] 1 steps take 1.382 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.163\n",
      "[proc 0][Train](22/100000) average pos_loss: 0.48485255241394043\n",
      "[proc 0][Train](22/100000) average neg_loss: 0.7017270922660828\n",
      "[proc 0][Train](22/100000) average loss: 0.593289852142334\n",
      "[proc 0][Train](22/100000) average regularization: 0.004378851503133774\n",
      "[proc 0][Train] 1 steps take 1.335 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.219, backward: 0.002, update: 1.111\n",
      "[proc 1][Train](21/100000) average pos_loss: 0.4982823133468628\n",
      "[proc 1][Train](21/100000) average neg_loss: 0.6912356615066528\n",
      "[proc 1][Train](21/100000) average loss: 0.5947589874267578\n",
      "[proc 1][Train](21/100000) average regularization: 0.003906598314642906\n",
      "[proc 1][Train] 1 steps take 1.441 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.221\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](23/100000) average pos_loss: 0.4476853013038635\n",
      "[proc 0][Train](23/100000) average neg_loss: 0.6793304681777954\n",
      "[proc 0][Train](23/100000) average loss: 0.5635079145431519\n",
      "[proc 0][Train](23/100000) average regularization: 0.004444260615855455\n",
      "[proc 0][Train] 1 steps take 1.340 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.124\n",
      "[proc 1][Train](22/100000) average pos_loss: 0.40916746854782104\n",
      "[proc 1][Train](22/100000) average neg_loss: 0.8131246566772461\n",
      "[proc 1][Train](22/100000) average loss: 0.611146092414856\n",
      "[proc 1][Train](22/100000) average regularization: 0.005911721382290125\n",
      "[proc 1][Train] 1 steps take 1.460 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.212, backward: 0.002, update: 1.243\n",
      "[proc 0][Train](24/100000) average pos_loss: 0.3940250277519226\n",
      "[proc 0][Train](24/100000) average neg_loss: 0.832101047039032\n",
      "[proc 0][Train](24/100000) average loss: 0.6130630373954773\n",
      "[proc 0][Train](24/100000) average regularization: 0.0075819361954927444\n",
      "[proc 0][Train] 1 steps take 1.274 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.056\n",
      "[proc 1][Train](23/100000) average pos_loss: 0.38159722089767456\n",
      "[proc 1][Train](23/100000) average neg_loss: 0.7710293531417847\n",
      "[proc 1][Train](23/100000) average loss: 0.5763132572174072\n",
      "[proc 1][Train](23/100000) average regularization: 0.0062415581196546555\n",
      "[proc 1][Train] 1 steps take 1.337 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.220, backward: 0.003, update: 1.112\n",
      "[proc 0][Train](25/100000) average pos_loss: 0.3861464858055115\n",
      "[proc 0][Train](25/100000) average neg_loss: 0.711998462677002\n",
      "[proc 0][Train](25/100000) average loss: 0.5490725040435791\n",
      "[proc 0][Train](25/100000) average regularization: 0.005050015635788441\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](24/100000) average pos_loss: 0.3789352774620056\n",
      "[proc 1][Train](24/100000) average neg_loss: 0.7555485963821411\n",
      "[proc 1][Train](24/100000) average loss: 0.567241907119751\n",
      "[proc 1][Train](24/100000) average regularization: 0.005142838228493929\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.085\n",
      "[proc 0][Train](26/100000) average pos_loss: 0.3632359504699707\n",
      "[proc 0][Train](26/100000) average neg_loss: 0.7529655694961548\n",
      "[proc 0][Train](26/100000) average loss: 0.5581007599830627\n",
      "[proc 0][Train](26/100000) average regularization: 0.006726243998855352\n",
      "[proc 0][Train] 1 steps take 1.324 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.108\n",
      "[proc 1][Train](25/100000) average pos_loss: 0.3369867205619812\n",
      "[proc 1][Train](25/100000) average neg_loss: 1.1082350015640259\n",
      "[proc 1][Train](25/100000) average loss: 0.7226108312606812\n",
      "[proc 1][Train](25/100000) average regularization: 0.007146983873099089\n",
      "[proc 1][Train] 1 steps take 1.377 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.159\n",
      "[proc 0][Train](27/100000) average pos_loss: 0.3260848820209503\n",
      "[proc 0][Train](27/100000) average neg_loss: 1.1708862781524658\n",
      "[proc 0][Train](27/100000) average loss: 0.7484855651855469\n",
      "[proc 0][Train](27/100000) average regularization: 0.007353553082793951\n",
      "[proc 0][Train] 1 steps take 1.376 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.220, backward: 0.003, update: 1.152\n",
      "[proc 1][Train](26/100000) average pos_loss: 0.320802241563797\n",
      "[proc 1][Train](26/100000) average neg_loss: 0.8450573682785034\n",
      "[proc 1][Train](26/100000) average loss: 0.582929790019989\n",
      "[proc 1][Train](26/100000) average regularization: 0.007793861906975508\n",
      "[proc 1][Train] 1 steps take 1.333 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.118\n",
      "[proc 0][Train](28/100000) average pos_loss: 0.3218799829483032\n",
      "[proc 0][Train](28/100000) average neg_loss: 0.8430548906326294\n",
      "[proc 0][Train](28/100000) average loss: 0.5824674367904663\n",
      "[proc 0][Train](28/100000) average regularization: 0.01482491847127676\n",
      "[proc 0][Train] 1 steps take 1.334 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.219, backward: 0.003, update: 1.111\n",
      "[proc 1][Train](27/100000) average pos_loss: 0.3266991972923279\n",
      "[proc 1][Train](27/100000) average neg_loss: 1.4816622734069824\n",
      "[proc 1][Train](27/100000) average loss: 0.9041807651519775\n",
      "[proc 1][Train](27/100000) average regularization: 0.01412258017808199\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](29/100000) average pos_loss: 0.3414139151573181\n",
      "[proc 0][Train](29/100000) average neg_loss: 1.311105489730835\n",
      "[proc 0][Train](29/100000) average loss: 0.8262597322463989\n",
      "[proc 0][Train](29/100000) average regularization: 0.012362174689769745\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.011, forward: 0.215, backward: 0.004, update: 1.080\n",
      "[proc 1][Train](28/100000) average pos_loss: 0.3160010576248169\n",
      "[proc 1][Train](28/100000) average neg_loss: 0.7751187682151794\n",
      "[proc 1][Train](28/100000) average loss: 0.5455598831176758\n",
      "[proc 1][Train](28/100000) average regularization: 0.007439468987286091\n",
      "[proc 1][Train] 1 steps take 1.477 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.258\n",
      "[proc 0][Train](30/100000) average pos_loss: 0.2984659969806671\n",
      "[proc 0][Train](30/100000) average neg_loss: 0.8711838126182556\n",
      "[proc 0][Train](30/100000) average loss: 0.5848249197006226\n",
      "[proc 0][Train](30/100000) average regularization: 0.008050397969782352\n",
      "[proc 0][Train] 1 steps take 1.348 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.215, backward: 0.003, update: 1.128\n",
      "[proc 1][Train](29/100000) average pos_loss: 0.30564844608306885\n",
      "[proc 1][Train](29/100000) average neg_loss: 0.8149678707122803\n",
      "[proc 1][Train](29/100000) average loss: 0.5603081583976746\n",
      "[proc 1][Train](29/100000) average regularization: 0.008061453700065613\n",
      "[proc 1][Train] 1 steps take 1.283 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.065\n",
      "[proc 0][Train](31/100000) average pos_loss: 0.3108474910259247\n",
      "[proc 0][Train](31/100000) average neg_loss: 0.7577338814735413\n",
      "[proc 0][Train](31/100000) average loss: 0.5342906713485718\n",
      "[proc 0][Train](31/100000) average regularization: 0.006977734621614218\n",
      "[proc 0][Train] 1 steps take 1.454 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.234\n",
      "[proc 1][Train](30/100000) average pos_loss: 0.3185853064060211\n",
      "[proc 1][Train](30/100000) average neg_loss: 0.7475799322128296\n",
      "[proc 1][Train](30/100000) average loss: 0.5330826044082642\n",
      "[proc 1][Train](30/100000) average regularization: 0.006583923473954201\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.210, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](32/100000) average pos_loss: 0.3141694962978363\n",
      "[proc 0][Train](32/100000) average neg_loss: 0.7696775197982788\n",
      "[proc 0][Train](32/100000) average loss: 0.5419235229492188\n",
      "[proc 0][Train](32/100000) average regularization: 0.006871349178254604\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.087\n",
      "[proc 1][Train](31/100000) average pos_loss: 0.2953551411628723\n",
      "[proc 1][Train](31/100000) average neg_loss: 0.6764374375343323\n",
      "[proc 1][Train](31/100000) average loss: 0.4858962893486023\n",
      "[proc 1][Train](31/100000) average regularization: 0.007554764859378338\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.067\n",
      "[proc 0][Train](33/100000) average pos_loss: 0.2895802855491638\n",
      "[proc 0][Train](33/100000) average neg_loss: 0.7636357545852661\n",
      "[proc 0][Train](33/100000) average loss: 0.5266079902648926\n",
      "[proc 0][Train](33/100000) average regularization: 0.00805413257330656\n",
      "[proc 0][Train] 1 steps take 1.286 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.197, backward: 0.003, update: 1.069\n",
      "[proc 1][Train](32/100000) average pos_loss: 0.27860409021377563\n",
      "[proc 1][Train](32/100000) average neg_loss: 0.9084146022796631\n",
      "[proc 1][Train](32/100000) average loss: 0.593509316444397\n",
      "[proc 1][Train](32/100000) average regularization: 0.008875157684087753\n",
      "[proc 1][Train] 1 steps take 1.274 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.071\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](34/100000) average pos_loss: 0.2773744463920593\n",
      "[proc 0][Train](34/100000) average neg_loss: 0.8167494535446167\n",
      "[proc 0][Train](34/100000) average loss: 0.5470619201660156\n",
      "[proc 0][Train](34/100000) average regularization: 0.008314086124300957\n",
      "[proc 0][Train] 1 steps take 1.290 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.211, backward: 0.003, update: 1.058\n",
      "[proc 1][Train](33/100000) average pos_loss: 0.3067801594734192\n",
      "[proc 1][Train](33/100000) average neg_loss: 0.6612182259559631\n",
      "[proc 1][Train](33/100000) average loss: 0.48399919271469116\n",
      "[proc 1][Train](33/100000) average regularization: 0.0068478453904390335\n",
      "[proc 1][Train] 1 steps take 1.424 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.199, backward: 0.003, update: 1.205\n",
      "[proc 0][Train](35/100000) average pos_loss: 0.3116009533405304\n",
      "[proc 0][Train](35/100000) average neg_loss: 0.6839151382446289\n",
      "[proc 0][Train](35/100000) average loss: 0.49775803089141846\n",
      "[proc 0][Train](35/100000) average regularization: 0.0066536227241158485\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](34/100000) average pos_loss: 0.2818240225315094\n",
      "[proc 1][Train](34/100000) average neg_loss: 0.8151422739028931\n",
      "[proc 1][Train](34/100000) average loss: 0.54848313331604\n",
      "[proc 1][Train](34/100000) average regularization: 0.007409223355352879\n",
      "[proc 1][Train] 1 steps take 1.267 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.214, backward: 0.004, update: 1.032\n",
      "[proc 0][Train](36/100000) average pos_loss: 0.2748883366584778\n",
      "[proc 0][Train](36/100000) average neg_loss: 0.8706997036933899\n",
      "[proc 0][Train](36/100000) average loss: 0.5727940201759338\n",
      "[proc 0][Train](36/100000) average regularization: 0.00882247369736433\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.095\n",
      "[proc 1][Train](35/100000) average pos_loss: 0.2816397547721863\n",
      "[proc 1][Train](35/100000) average neg_loss: 0.7258291244506836\n",
      "[proc 1][Train](35/100000) average loss: 0.5037344694137573\n",
      "[proc 1][Train](35/100000) average regularization: 0.008413176983594894\n",
      "[proc 1][Train] 1 steps take 1.268 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.059\n",
      "[proc 0][Train](37/100000) average pos_loss: 0.3043424189090729\n",
      "[proc 0][Train](37/100000) average neg_loss: 0.677760660648346\n",
      "[proc 0][Train](37/100000) average loss: 0.4910515546798706\n",
      "[proc 0][Train](37/100000) average regularization: 0.007060195319354534\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.081\n",
      "[proc 1][Train](36/100000) average pos_loss: 0.2958536744117737\n",
      "[proc 1][Train](36/100000) average neg_loss: 0.7957816123962402\n",
      "[proc 1][Train](36/100000) average loss: 0.5458176136016846\n",
      "[proc 1][Train](36/100000) average regularization: 0.007099650800228119\n",
      "[proc 1][Train] 1 steps take 1.331 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.112\n",
      "[proc 0][Train](38/100000) average pos_loss: 0.2777705192565918\n",
      "[proc 0][Train](38/100000) average neg_loss: 0.7648822665214539\n",
      "[proc 0][Train](38/100000) average loss: 0.5213264226913452\n",
      "[proc 0][Train](38/100000) average regularization: 0.007606107275933027\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.080\n",
      "[proc 1][Train](37/100000) average pos_loss: 0.28667327761650085\n",
      "[proc 1][Train](37/100000) average neg_loss: 0.6728321313858032\n",
      "[proc 1][Train](37/100000) average loss: 0.47975271940231323\n",
      "[proc 1][Train](37/100000) average regularization: 0.007736364379525185\n",
      "[proc 1][Train] 1 steps take 1.335 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.238, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](39/100000) average pos_loss: 0.284601628780365\n",
      "[proc 0][Train](39/100000) average neg_loss: 0.6475468873977661\n",
      "[proc 0][Train](39/100000) average loss: 0.46607425808906555\n",
      "[proc 0][Train](39/100000) average regularization: 0.008327472023665905\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.099\n",
      "[proc 1][Train](38/100000) average pos_loss: 0.27955329418182373\n",
      "[proc 1][Train](38/100000) average neg_loss: 0.8632357120513916\n",
      "[proc 1][Train](38/100000) average loss: 0.5713945031166077\n",
      "[proc 1][Train](38/100000) average regularization: 0.00877811573445797\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.084\n",
      "[proc 0][Train](40/100000) average pos_loss: 0.26807546615600586\n",
      "[proc 0][Train](40/100000) average neg_loss: 0.9328826665878296\n",
      "[proc 0][Train](40/100000) average loss: 0.6004790663719177\n",
      "[proc 0][Train](40/100000) average regularization: 0.009424250572919846\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.095\n",
      "[proc 1][Train](39/100000) average pos_loss: 0.2827855348587036\n",
      "[proc 1][Train](39/100000) average neg_loss: 0.6990455389022827\n",
      "[proc 1][Train](39/100000) average loss: 0.49091553688049316\n",
      "[proc 1][Train](39/100000) average regularization: 0.00803457386791706\n",
      "[proc 1][Train] 1 steps take 1.285 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.067\n",
      "[proc 0][Train](41/100000) average pos_loss: 0.3108341693878174\n",
      "[proc 0][Train](41/100000) average neg_loss: 0.6455126404762268\n",
      "[proc 0][Train](41/100000) average loss: 0.4781734049320221\n",
      "[proc 0][Train](41/100000) average regularization: 0.006896935403347015\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](40/100000) average pos_loss: 0.2982673943042755\n",
      "[proc 1][Train](40/100000) average neg_loss: 0.7530821561813354\n",
      "[proc 1][Train](40/100000) average loss: 0.5256747603416443\n",
      "[proc 1][Train](40/100000) average regularization: 0.007000131998211145\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](42/100000) average pos_loss: 0.26770445704460144\n",
      "[proc 0][Train](42/100000) average neg_loss: 0.8241760730743408\n",
      "[proc 0][Train](42/100000) average loss: 0.5459402799606323\n",
      "[proc 0][Train](42/100000) average regularization: 0.008336784318089485\n",
      "[proc 0][Train] 1 steps take 1.280 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.061\n",
      "[proc 1][Train](41/100000) average pos_loss: 0.272880882024765\n",
      "[proc 1][Train](41/100000) average neg_loss: 0.7366232872009277\n",
      "[proc 1][Train](41/100000) average loss: 0.5047520995140076\n",
      "[proc 1][Train](41/100000) average regularization: 0.009080829098820686\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](43/100000) average pos_loss: 0.2751167118549347\n",
      "[proc 0][Train](43/100000) average neg_loss: 0.6988004446029663\n",
      "[proc 0][Train](43/100000) average loss: 0.4869585633277893\n",
      "[proc 0][Train](43/100000) average regularization: 0.008264436386525631\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](42/100000) average pos_loss: 0.2802656888961792\n",
      "[proc 1][Train](42/100000) average neg_loss: 0.806883692741394\n",
      "[proc 1][Train](42/100000) average loss: 0.5435746908187866\n",
      "[proc 1][Train](42/100000) average regularization: 0.007861743681132793\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](44/100000) average pos_loss: 0.28017300367355347\n",
      "[proc 0][Train](44/100000) average neg_loss: 0.7832379937171936\n",
      "[proc 0][Train](44/100000) average loss: 0.5317054986953735\n",
      "[proc 0][Train](44/100000) average regularization: 0.0081067755818367\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.084\n",
      "[proc 1][Train](43/100000) average pos_loss: 0.29024386405944824\n",
      "[proc 1][Train](43/100000) average neg_loss: 0.6390112638473511\n",
      "[proc 1][Train](43/100000) average loss: 0.46462756395339966\n",
      "[proc 1][Train](43/100000) average regularization: 0.008004214614629745\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.114\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](45/100000) average pos_loss: 0.29258108139038086\n",
      "[proc 0][Train](45/100000) average neg_loss: 0.619147539138794\n",
      "[proc 0][Train](45/100000) average loss: 0.4558643102645874\n",
      "[proc 0][Train](45/100000) average regularization: 0.00802880059927702\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.225, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](44/100000) average pos_loss: 0.27796727418899536\n",
      "[proc 1][Train](44/100000) average neg_loss: 0.8408651351928711\n",
      "[proc 1][Train](44/100000) average loss: 0.5594161748886108\n",
      "[proc 1][Train](44/100000) average regularization: 0.008789388462901115\n",
      "[proc 1][Train] 1 steps take 1.295 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.084\n",
      "[proc 0][Train](46/100000) average pos_loss: 0.25256311893463135\n",
      "[proc 0][Train](46/100000) average neg_loss: 0.9447612762451172\n",
      "[proc 0][Train](46/100000) average loss: 0.5986621975898743\n",
      "[proc 0][Train](46/100000) average regularization: 0.009593944996595383\n",
      "[proc 0][Train] 1 steps take 1.269 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.198, backward: 0.003, update: 1.066\n",
      "[proc 1][Train](45/100000) average pos_loss: 0.2636348605155945\n",
      "[proc 1][Train](45/100000) average neg_loss: 0.6645171642303467\n",
      "[proc 1][Train](45/100000) average loss: 0.4640760123729706\n",
      "[proc 1][Train](45/100000) average regularization: 0.008423347026109695\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](47/100000) average pos_loss: 0.3012910485267639\n",
      "[proc 0][Train](47/100000) average neg_loss: 0.6649336814880371\n",
      "[proc 0][Train](47/100000) average loss: 0.4831123650074005\n",
      "[proc 0][Train](47/100000) average regularization: 0.0071511827409267426\n",
      "[proc 0][Train] 1 steps take 1.287 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.198, backward: 0.002, update: 1.085\n",
      "[proc 1][Train](46/100000) average pos_loss: 0.2898824214935303\n",
      "[proc 1][Train](46/100000) average neg_loss: 0.7245938181877136\n",
      "[proc 1][Train](46/100000) average loss: 0.5072381496429443\n",
      "[proc 1][Train](46/100000) average regularization: 0.00754759693518281\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.091\n",
      "[proc 0][Train](48/100000) average pos_loss: 0.2866363525390625\n",
      "[proc 0][Train](48/100000) average neg_loss: 0.7749782800674438\n",
      "[proc 0][Train](48/100000) average loss: 0.5308073163032532\n",
      "[proc 0][Train](48/100000) average regularization: 0.008271102793514729\n",
      "[proc 0][Train] 1 steps take 1.316 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.217, backward: 0.002, update: 1.095\n",
      "[proc 1][Train](47/100000) average pos_loss: 0.2750670313835144\n",
      "[proc 1][Train](47/100000) average neg_loss: 0.6808366179466248\n",
      "[proc 1][Train](47/100000) average loss: 0.4779518246650696\n",
      "[proc 1][Train](47/100000) average regularization: 0.00890645757317543\n",
      "[proc 1][Train] 1 steps take 1.260 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.049\n",
      "[proc 0][Train](49/100000) average pos_loss: 0.2798417806625366\n",
      "[proc 0][Train](49/100000) average neg_loss: 0.6994744539260864\n",
      "[proc 0][Train](49/100000) average loss: 0.4896581172943115\n",
      "[proc 0][Train](49/100000) average regularization: 0.008327532559633255\n",
      "[proc 0][Train] 1 steps take 1.334 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.208, backward: 0.002, update: 1.106\n",
      "[proc 1][Train](48/100000) average pos_loss: 0.26701706647872925\n",
      "[proc 1][Train](48/100000) average neg_loss: 0.8312009572982788\n",
      "[proc 1][Train](48/100000) average loss: 0.5491089820861816\n",
      "[proc 1][Train](48/100000) average regularization: 0.008314996026456356\n",
      "[proc 1][Train] 1 steps take 1.363 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.153\n",
      "[proc 0][Train](50/100000) average pos_loss: 0.2682051956653595\n",
      "[proc 0][Train](50/100000) average neg_loss: 0.8404849171638489\n",
      "[proc 0][Train](50/100000) average loss: 0.5543450713157654\n",
      "[proc 0][Train](50/100000) average regularization: 0.008469237945973873\n",
      "[proc 0][Train] 1 steps take 1.349 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.216, backward: 0.003, update: 1.114\n",
      "[proc 1][Train](49/100000) average pos_loss: 0.2928183674812317\n",
      "[proc 1][Train](49/100000) average neg_loss: 0.6019424200057983\n",
      "[proc 1][Train](49/100000) average loss: 0.447380393743515\n",
      "[proc 1][Train](49/100000) average regularization: 0.007928377017378807\n",
      "[proc 1][Train] 1 steps take 1.361 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.216, backward: 0.003, update: 1.122\n",
      "[proc 0][Train](51/100000) average pos_loss: 0.3247736394405365\n",
      "[proc 0][Train](51/100000) average neg_loss: 0.5625723600387573\n",
      "[proc 0][Train](51/100000) average loss: 0.4436730146408081\n",
      "[proc 0][Train](51/100000) average regularization: 0.00763949379324913\n",
      "[proc 0][Train] 1 steps take 1.322 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.103\n",
      "[proc 1][Train](50/100000) average pos_loss: 0.28892260789871216\n",
      "[proc 1][Train](50/100000) average neg_loss: 0.7541451454162598\n",
      "[proc 1][Train](50/100000) average loss: 0.5215338468551636\n",
      "[proc 1][Train](50/100000) average regularization: 0.008612461388111115\n",
      "[proc 1][Train] 1 steps take 1.318 seconds\n",
      "[proc 1]sample: 0.022, forward: 0.211, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](52/100000) average pos_loss: 0.24945631623268127\n",
      "[proc 0][Train](52/100000) average neg_loss: 0.8786927461624146\n",
      "[proc 0][Train](52/100000) average loss: 0.5640745162963867\n",
      "[proc 0][Train](52/100000) average regularization: 0.00980543065816164\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.081\n",
      "[proc 1][Train](51/100000) average pos_loss: 0.2532973289489746\n",
      "[proc 1][Train](51/100000) average neg_loss: 0.7511858940124512\n",
      "[proc 1][Train](51/100000) average loss: 0.5022416114807129\n",
      "[proc 1][Train](51/100000) average regularization: 0.009200209751725197\n",
      "[proc 1][Train] 1 steps take 1.387 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.170\n",
      "[proc 0][Train](53/100000) average pos_loss: 0.2802135944366455\n",
      "[proc 0][Train](53/100000) average neg_loss: 0.6431285738945007\n",
      "[proc 0][Train](53/100000) average loss: 0.4616710841655731\n",
      "[proc 0][Train](53/100000) average regularization: 0.008221490308642387\n",
      "[proc 0][Train] 1 steps take 1.344 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.125\n",
      "[proc 1][Train](52/100000) average pos_loss: 0.2988084852695465\n",
      "[proc 1][Train](52/100000) average neg_loss: 0.7193869948387146\n",
      "[proc 1][Train](52/100000) average loss: 0.5090977549552917\n",
      "[proc 1][Train](52/100000) average regularization: 0.008101921528577805\n",
      "[proc 1][Train] 1 steps take 1.269 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.051\n",
      "[proc 0][Train](54/100000) average pos_loss: 0.2862038016319275\n",
      "[proc 0][Train](54/100000) average neg_loss: 0.804077684879303\n",
      "[proc 0][Train](54/100000) average loss: 0.5451407432556152\n",
      "[proc 0][Train](54/100000) average regularization: 0.008861715905368328\n",
      "[proc 0][Train] 1 steps take 1.422 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.218, backward: 0.002, update: 1.200\n",
      "[proc 1][Train](53/100000) average pos_loss: 0.28507503867149353\n",
      "[proc 1][Train](53/100000) average neg_loss: 0.6182039380073547\n",
      "[proc 1][Train](53/100000) average loss: 0.45163947343826294\n",
      "[proc 1][Train](53/100000) average regularization: 0.008985409513115883\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.071\n",
      "[proc 0][Train](55/100000) average pos_loss: 0.27086856961250305\n",
      "[proc 0][Train](55/100000) average neg_loss: 0.5976452827453613\n",
      "[proc 0][Train](55/100000) average loss: 0.434256911277771\n",
      "[proc 0][Train](55/100000) average regularization: 0.008651936426758766\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.089\n",
      "[proc 1][Train](54/100000) average pos_loss: 0.2590494155883789\n",
      "[proc 1][Train](54/100000) average neg_loss: 0.833499550819397\n",
      "[proc 1][Train](54/100000) average loss: 0.5462744832038879\n",
      "[proc 1][Train](54/100000) average regularization: 0.00904951710253954\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.093\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](56/100000) average pos_loss: 0.2548777163028717\n",
      "[proc 0][Train](56/100000) average neg_loss: 0.8347815275192261\n",
      "[proc 0][Train](56/100000) average loss: 0.5448296070098877\n",
      "[proc 0][Train](56/100000) average regularization: 0.009755194187164307\n",
      "[proc 0][Train] 1 steps take 1.325 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](55/100000) average pos_loss: 0.28476518392562866\n",
      "[proc 1][Train](55/100000) average neg_loss: 0.6045381426811218\n",
      "[proc 1][Train](55/100000) average loss: 0.44465166330337524\n",
      "[proc 1][Train](55/100000) average regularization: 0.00903304759413004\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](57/100000) average pos_loss: 0.3163188099861145\n",
      "[proc 0][Train](57/100000) average neg_loss: 0.5710470080375671\n",
      "[proc 0][Train](57/100000) average loss: 0.4436829090118408\n",
      "[proc 0][Train](57/100000) average regularization: 0.008530212566256523\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.217, backward: 0.003, update: 1.076\n",
      "[proc 1][Train](56/100000) average pos_loss: 0.29197463393211365\n",
      "[proc 1][Train](56/100000) average neg_loss: 0.7416468858718872\n",
      "[proc 1][Train](56/100000) average loss: 0.5168107748031616\n",
      "[proc 1][Train](56/100000) average regularization: 0.008970669470727444\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.078\n",
      "[proc 0][Train](58/100000) average pos_loss: 0.25718826055526733\n",
      "[proc 0][Train](58/100000) average neg_loss: 0.8004170656204224\n",
      "[proc 0][Train](58/100000) average loss: 0.5288026332855225\n",
      "[proc 0][Train](58/100000) average regularization: 0.009672362357378006\n",
      "[proc 0][Train] 1 steps take 1.323 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.113\n",
      "[proc 1][Train](57/100000) average pos_loss: 0.2591812312602997\n",
      "[proc 1][Train](57/100000) average neg_loss: 0.6347087025642395\n",
      "[proc 1][Train](57/100000) average loss: 0.4469449520111084\n",
      "[proc 1][Train](57/100000) average regularization: 0.00934869796037674\n",
      "[proc 1][Train] 1 steps take 1.320 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.105\n",
      "[proc 0][Train](59/100000) average pos_loss: 0.27911263704299927\n",
      "[proc 0][Train](59/100000) average neg_loss: 0.6042617559432983\n",
      "[proc 0][Train](59/100000) average loss: 0.4416871964931488\n",
      "[proc 0][Train](59/100000) average regularization: 0.008920452557504177\n",
      "[proc 0][Train] 1 steps take 1.278 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.070\n",
      "[proc 1][Train](58/100000) average pos_loss: 0.28373751044273376\n",
      "[proc 1][Train](58/100000) average neg_loss: 0.7894079685211182\n",
      "[proc 1][Train](58/100000) average loss: 0.5365727543830872\n",
      "[proc 1][Train](58/100000) average regularization: 0.00917267706245184\n",
      "[proc 1][Train] 1 steps take 1.353 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.132\n",
      "[proc 0][Train](60/100000) average pos_loss: 0.28245657682418823\n",
      "[proc 0][Train](60/100000) average neg_loss: 0.7818961143493652\n",
      "[proc 0][Train](60/100000) average loss: 0.5321763753890991\n",
      "[proc 0][Train](60/100000) average regularization: 0.010045584291219711\n",
      "[proc 0][Train] 1 steps take 1.296 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.080\n",
      "[proc 1][Train](59/100000) average pos_loss: 0.28411799669265747\n",
      "[proc 1][Train](59/100000) average neg_loss: 0.6917667388916016\n",
      "[proc 1][Train](59/100000) average loss: 0.4879423677921295\n",
      "[proc 1][Train](59/100000) average regularization: 0.009570294991135597\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.094\n",
      "[proc 0][Train](61/100000) average pos_loss: 0.28460368514060974\n",
      "[proc 0][Train](61/100000) average neg_loss: 0.5581204891204834\n",
      "[proc 0][Train](61/100000) average loss: 0.42136210203170776\n",
      "[proc 0][Train](61/100000) average regularization: 0.009099995717406273\n",
      "[proc 0][Train] 1 steps take 1.474 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.002, update: 1.261\n",
      "[proc 1][Train](60/100000) average pos_loss: 0.28275972604751587\n",
      "[proc 1][Train](60/100000) average neg_loss: 0.7198110222816467\n",
      "[proc 1][Train](60/100000) average loss: 0.5012853741645813\n",
      "[proc 1][Train](60/100000) average regularization: 0.008804702199995518\n",
      "[proc 1][Train] 1 steps take 1.336 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.116\n",
      "[proc 0][Train](62/100000) average pos_loss: 0.2701879143714905\n",
      "[proc 0][Train](62/100000) average neg_loss: 0.7884433269500732\n",
      "[proc 0][Train](62/100000) average loss: 0.5293155908584595\n",
      "[proc 0][Train](62/100000) average regularization: 0.009528180584311485\n",
      "[proc 0][Train] 1 steps take 1.268 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.059\n",
      "[proc 1][Train](61/100000) average pos_loss: 0.26919490098953247\n",
      "[proc 1][Train](61/100000) average neg_loss: 0.5934640765190125\n",
      "[proc 1][Train](61/100000) average loss: 0.43132948875427246\n",
      "[proc 1][Train](61/100000) average regularization: 0.009641707874834538\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.072\n",
      "[proc 0][Train](63/100000) average pos_loss: 0.28883999586105347\n",
      "[proc 0][Train](63/100000) average neg_loss: 0.6274542808532715\n",
      "[proc 0][Train](63/100000) average loss: 0.4581471383571625\n",
      "[proc 0][Train](63/100000) average regularization: 0.00956043228507042\n",
      "[proc 0][Train] 1 steps take 1.293 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.075\n",
      "[proc 1][Train](62/100000) average pos_loss: 0.27898383140563965\n",
      "[proc 1][Train](62/100000) average neg_loss: 0.7807254791259766\n",
      "[proc 1][Train](62/100000) average loss: 0.5298546552658081\n",
      "[proc 1][Train](62/100000) average regularization: 0.009837111458182335\n",
      "[proc 1][Train] 1 steps take 1.296 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.193, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](64/100000) average pos_loss: 0.26238179206848145\n",
      "[proc 0][Train](64/100000) average neg_loss: 0.785466730594635\n",
      "[proc 0][Train](64/100000) average loss: 0.5239242315292358\n",
      "[proc 0][Train](64/100000) average regularization: 0.009988432750105858\n",
      "[proc 0][Train] 1 steps take 1.310 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.092\n",
      "[proc 1][Train](63/100000) average pos_loss: 0.28688347339630127\n",
      "[proc 1][Train](63/100000) average neg_loss: 0.5874983072280884\n",
      "[proc 1][Train](63/100000) average loss: 0.4371908903121948\n",
      "[proc 1][Train](63/100000) average regularization: 0.009412640705704689\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.091\n",
      "[proc 0][Train](65/100000) average pos_loss: 0.3002398908138275\n",
      "[proc 0][Train](65/100000) average neg_loss: 0.5762298107147217\n",
      "[proc 0][Train](65/100000) average loss: 0.4382348656654358\n",
      "[proc 0][Train](65/100000) average regularization: 0.00907126534730196\n",
      "[proc 0][Train] 1 steps take 1.377 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.216, backward: 0.003, update: 1.139\n",
      "[proc 1][Train](64/100000) average pos_loss: 0.2805269658565521\n",
      "[proc 1][Train](64/100000) average neg_loss: 0.7380204200744629\n",
      "[proc 1][Train](64/100000) average loss: 0.5092737078666687\n",
      "[proc 1][Train](64/100000) average regularization: 0.009573737159371376\n",
      "[proc 1][Train] 1 steps take 1.459 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.239\n",
      "[proc 0][Train](66/100000) average pos_loss: 0.262273907661438\n",
      "[proc 0][Train](66/100000) average neg_loss: 0.802690863609314\n",
      "[proc 0][Train](66/100000) average loss: 0.532482385635376\n",
      "[proc 0][Train](66/100000) average regularization: 0.010383549146354198\n",
      "[proc 0][Train] 1 steps take 1.262 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.205, backward: 0.002, update: 1.037\n",
      "[proc 1][Train](65/100000) average pos_loss: 0.2706282138824463\n",
      "[proc 1][Train](65/100000) average neg_loss: 0.6140220165252686\n",
      "[proc 1][Train](65/100000) average loss: 0.4423251152038574\n",
      "[proc 1][Train](65/100000) average regularization: 0.00995826069265604\n",
      "[proc 1][Train] 1 steps take 1.322 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.205, backward: 0.003, update: 1.095\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](67/100000) average pos_loss: 0.29097458720207214\n",
      "[proc 0][Train](67/100000) average neg_loss: 0.5569337606430054\n",
      "[proc 0][Train](67/100000) average loss: 0.42395418882369995\n",
      "[proc 0][Train](67/100000) average regularization: 0.009553156793117523\n",
      "[proc 0][Train] 1 steps take 1.408 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.188\n",
      "[proc 1][Train](66/100000) average pos_loss: 0.28499454259872437\n",
      "[proc 1][Train](66/100000) average neg_loss: 0.6870698928833008\n",
      "[proc 1][Train](66/100000) average loss: 0.4860322177410126\n",
      "[proc 1][Train](66/100000) average regularization: 0.009517766535282135\n",
      "[proc 1][Train] 1 steps take 1.296 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.204, backward: 0.003, update: 1.071\n",
      "[proc 0][Train](68/100000) average pos_loss: 0.2751612067222595\n",
      "[proc 0][Train](68/100000) average neg_loss: 0.7693968415260315\n",
      "[proc 0][Train](68/100000) average loss: 0.5222790241241455\n",
      "[proc 0][Train](68/100000) average regularization: 0.010188371874392033\n",
      "[proc 0][Train] 1 steps take 1.320 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.105\n",
      "[proc 1][Train](67/100000) average pos_loss: 0.26704341173171997\n",
      "[proc 1][Train](67/100000) average neg_loss: 0.584918200969696\n",
      "[proc 1][Train](67/100000) average loss: 0.425980806350708\n",
      "[proc 1][Train](67/100000) average regularization: 0.010590163990855217\n",
      "[proc 1][Train] 1 steps take 1.309 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.103\n",
      "[proc 0][Train](69/100000) average pos_loss: 0.29311567544937134\n",
      "[proc 0][Train](69/100000) average neg_loss: 0.5981842279434204\n",
      "[proc 0][Train](69/100000) average loss: 0.4456499516963959\n",
      "[proc 0][Train](69/100000) average regularization: 0.010130650363862514\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.205, backward: 0.002, update: 1.103\n",
      "[proc 1][Train](68/100000) average pos_loss: 0.28289902210235596\n",
      "[proc 1][Train](68/100000) average neg_loss: 0.8209675550460815\n",
      "[proc 1][Train](68/100000) average loss: 0.5519332885742188\n",
      "[proc 1][Train](68/100000) average regularization: 0.010380244813859463\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.197, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](70/100000) average pos_loss: 0.27323630452156067\n",
      "[proc 0][Train](70/100000) average neg_loss: 0.8756265640258789\n",
      "[proc 0][Train](70/100000) average loss: 0.5744314193725586\n",
      "[proc 0][Train](70/100000) average regularization: 0.010681853629648685\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](69/100000) average pos_loss: 0.28510782122612\n",
      "[proc 1][Train](69/100000) average neg_loss: 0.5391882658004761\n",
      "[proc 1][Train](69/100000) average loss: 0.41214805841445923\n",
      "[proc 1][Train](69/100000) average regularization: 0.009896463714540005\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](71/100000) average pos_loss: 0.3117610812187195\n",
      "[proc 0][Train](71/100000) average neg_loss: 0.5306901335716248\n",
      "[proc 0][Train](71/100000) average loss: 0.4212256073951721\n",
      "[proc 0][Train](71/100000) average regularization: 0.009087521582841873\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.211, backward: 0.004, update: 1.096\n",
      "[proc 1][Train](70/100000) average pos_loss: 0.2794485092163086\n",
      "[proc 1][Train](70/100000) average neg_loss: 0.7307336330413818\n",
      "[proc 1][Train](70/100000) average loss: 0.5050910711288452\n",
      "[proc 1][Train](70/100000) average regularization: 0.009705641306936741\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.086\n",
      "[proc 0][Train](72/100000) average pos_loss: 0.2613339126110077\n",
      "[proc 0][Train](72/100000) average neg_loss: 0.8371117115020752\n",
      "[proc 0][Train](72/100000) average loss: 0.5492228269577026\n",
      "[proc 0][Train](72/100000) average regularization: 0.01099624764174223\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.004, update: 1.081\n",
      "[proc 1][Train](71/100000) average pos_loss: 0.2772371172904968\n",
      "[proc 1][Train](71/100000) average neg_loss: 0.5936431884765625\n",
      "[proc 1][Train](71/100000) average loss: 0.43544015288352966\n",
      "[proc 1][Train](71/100000) average regularization: 0.010726186446845531\n",
      "[proc 1][Train] 1 steps take 1.290 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](73/100000) average pos_loss: 0.306700736284256\n",
      "[proc 0][Train](73/100000) average neg_loss: 0.5419918894767761\n",
      "[proc 0][Train](73/100000) average loss: 0.42434632778167725\n",
      "[proc 0][Train](73/100000) average regularization: 0.009925728663802147\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.004, update: 1.084\n",
      "[proc 1][Train](72/100000) average pos_loss: 0.2936839461326599\n",
      "[proc 1][Train](72/100000) average neg_loss: 0.7254577875137329\n",
      "[proc 1][Train](72/100000) average loss: 0.509570837020874\n",
      "[proc 1][Train](72/100000) average regularization: 0.010126481764018536\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](74/100000) average pos_loss: 0.27377671003341675\n",
      "[proc 0][Train](74/100000) average neg_loss: 0.7543684244155884\n",
      "[proc 0][Train](74/100000) average loss: 0.5140725374221802\n",
      "[proc 0][Train](74/100000) average regularization: 0.010651636868715286\n",
      "[proc 0][Train] 1 steps take 1.293 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.073\n",
      "[proc 1][Train](73/100000) average pos_loss: 0.2747766375541687\n",
      "[proc 1][Train](73/100000) average neg_loss: 0.5864071846008301\n",
      "[proc 1][Train](73/100000) average loss: 0.4305919110774994\n",
      "[proc 1][Train](73/100000) average regularization: 0.010661495849490166\n",
      "[proc 1][Train] 1 steps take 1.321 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](75/100000) average pos_loss: 0.2878846526145935\n",
      "[proc 0][Train](75/100000) average neg_loss: 0.5267090201377869\n",
      "[proc 0][Train](75/100000) average loss: 0.4072968363761902\n",
      "[proc 0][Train](75/100000) average regularization: 0.010669843293726444\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.004, update: 1.111\n",
      "[proc 1][Train](74/100000) average pos_loss: 0.2848197817802429\n",
      "[proc 1][Train](74/100000) average neg_loss: 0.7150431275367737\n",
      "[proc 1][Train](74/100000) average loss: 0.4999314546585083\n",
      "[proc 1][Train](74/100000) average regularization: 0.010695883072912693\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.090\n",
      "[proc 0][Train](76/100000) average pos_loss: 0.27276384830474854\n",
      "[proc 0][Train](76/100000) average neg_loss: 0.7988766431808472\n",
      "[proc 0][Train](76/100000) average loss: 0.5358202457427979\n",
      "[proc 0][Train](76/100000) average regularization: 0.011399434879422188\n",
      "[proc 0][Train] 1 steps take 1.308 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.247, backward: 0.003, update: 1.056\n",
      "[proc 1][Train](75/100000) average pos_loss: 0.2836211323738098\n",
      "[proc 1][Train](75/100000) average neg_loss: 0.5066655874252319\n",
      "[proc 1][Train](75/100000) average loss: 0.3951433598995209\n",
      "[proc 1][Train](75/100000) average regularization: 0.010875758714973927\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](77/100000) average pos_loss: 0.3102602958679199\n",
      "[proc 0][Train](77/100000) average neg_loss: 0.5010713934898376\n",
      "[proc 0][Train](77/100000) average loss: 0.4056658446788788\n",
      "[proc 0][Train](77/100000) average regularization: 0.01042183954268694\n",
      "[proc 0][Train] 1 steps take 1.337 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.005, update: 1.119\n",
      "[proc 1][Train](76/100000) average pos_loss: 0.29264765977859497\n",
      "[proc 1][Train](76/100000) average neg_loss: 0.7402835488319397\n",
      "[proc 1][Train](76/100000) average loss: 0.5164656043052673\n",
      "[proc 1][Train](76/100000) average regularization: 0.010930093005299568\n",
      "[proc 1][Train] 1 steps take 1.321 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.101\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](78/100000) average pos_loss: 0.2716003656387329\n",
      "[proc 0][Train](78/100000) average neg_loss: 0.8460241556167603\n",
      "[proc 0][Train](78/100000) average loss: 0.5588122606277466\n",
      "[proc 0][Train](78/100000) average regularization: 0.011731269769370556\n",
      "[proc 0][Train] 1 steps take 1.324 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.004, update: 1.103\n",
      "[proc 1][Train](77/100000) average pos_loss: 0.2804516553878784\n",
      "[proc 1][Train](77/100000) average neg_loss: 0.5565411448478699\n",
      "[proc 1][Train](77/100000) average loss: 0.41849640011787415\n",
      "[proc 1][Train](77/100000) average regularization: 0.011575078591704369\n",
      "[proc 1][Train] 1 steps take 1.318 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.220, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](79/100000) average pos_loss: 0.31088173389434814\n",
      "[proc 0][Train](79/100000) average neg_loss: 0.5308533906936646\n",
      "[proc 0][Train](79/100000) average loss: 0.42086756229400635\n",
      "[proc 0][Train](79/100000) average regularization: 0.010542678646743298\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](78/100000) average pos_loss: 0.3016921281814575\n",
      "[proc 1][Train](78/100000) average neg_loss: 0.6741118431091309\n",
      "[proc 1][Train](78/100000) average loss: 0.4879019856452942\n",
      "[proc 1][Train](78/100000) average regularization: 0.010723462328314781\n",
      "[proc 1][Train] 1 steps take 1.280 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.195, backward: 0.002, update: 1.082\n",
      "[proc 0][Train](80/100000) average pos_loss: 0.2779923677444458\n",
      "[proc 0][Train](80/100000) average neg_loss: 0.7714564800262451\n",
      "[proc 0][Train](80/100000) average loss: 0.5247244238853455\n",
      "[proc 0][Train](80/100000) average regularization: 0.011256935074925423\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](79/100000) average pos_loss: 0.27491214871406555\n",
      "[proc 1][Train](79/100000) average neg_loss: 0.5529765486717224\n",
      "[proc 1][Train](79/100000) average loss: 0.4139443635940552\n",
      "[proc 1][Train](79/100000) average regularization: 0.011257771402597427\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.218, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](81/100000) average pos_loss: 0.29820892214775085\n",
      "[proc 0][Train](81/100000) average neg_loss: 0.605862021446228\n",
      "[proc 0][Train](81/100000) average loss: 0.45203548669815063\n",
      "[proc 0][Train](81/100000) average regularization: 0.010854857973754406\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.210, backward: 0.003, update: 1.076\n",
      "[proc 1][Train](80/100000) average pos_loss: 0.27921682596206665\n",
      "[proc 1][Train](80/100000) average neg_loss: 0.7130680084228516\n",
      "[proc 1][Train](80/100000) average loss: 0.4961424171924591\n",
      "[proc 1][Train](80/100000) average regularization: 0.011261598207056522\n",
      "[proc 1][Train] 1 steps take 1.318 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.218, backward: 0.002, update: 1.096\n",
      "[proc 0][Train](82/100000) average pos_loss: 0.2808294892311096\n",
      "[proc 0][Train](82/100000) average neg_loss: 0.731411337852478\n",
      "[proc 0][Train](82/100000) average loss: 0.5061204433441162\n",
      "[proc 0][Train](82/100000) average regularization: 0.011393871158361435\n",
      "[proc 0][Train] 1 steps take 1.307 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.207, backward: 0.003, update: 1.080\n",
      "[proc 1][Train](81/100000) average pos_loss: 0.2943955659866333\n",
      "[proc 1][Train](81/100000) average neg_loss: 0.4594522714614868\n",
      "[proc 1][Train](81/100000) average loss: 0.37692391872406006\n",
      "[proc 1][Train](81/100000) average regularization: 0.011143430136144161\n",
      "[proc 1][Train] 1 steps take 1.321 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.209, backward: 0.002, update: 1.094\n",
      "[proc 0][Train](83/100000) average pos_loss: 0.3113289177417755\n",
      "[proc 0][Train](83/100000) average neg_loss: 0.5770747065544128\n",
      "[proc 0][Train](83/100000) average loss: 0.44420182704925537\n",
      "[proc 0][Train](83/100000) average regularization: 0.010799822397530079\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](82/100000) average pos_loss: 0.28218764066696167\n",
      "[proc 1][Train](82/100000) average neg_loss: 0.7481749057769775\n",
      "[proc 1][Train](82/100000) average loss: 0.515181303024292\n",
      "[proc 1][Train](82/100000) average regularization: 0.011607516556978226\n",
      "[proc 1][Train] 1 steps take 1.322 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.210, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](84/100000) average pos_loss: 0.25974327325820923\n",
      "[proc 0][Train](84/100000) average neg_loss: 0.7784589529037476\n",
      "[proc 0][Train](84/100000) average loss: 0.5191011428833008\n",
      "[proc 0][Train](84/100000) average regularization: 0.011858101934194565\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](83/100000) average pos_loss: 0.28299498558044434\n",
      "[proc 1][Train](83/100000) average neg_loss: 0.5051031708717346\n",
      "[proc 1][Train](83/100000) average loss: 0.3940490782260895\n",
      "[proc 1][Train](83/100000) average regularization: 0.011398483999073505\n",
      "[proc 1][Train] 1 steps take 1.349 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.130\n",
      "[proc 0][Train](85/100000) average pos_loss: 0.31368786096572876\n",
      "[proc 0][Train](85/100000) average neg_loss: 0.5020474195480347\n",
      "[proc 0][Train](85/100000) average loss: 0.4078676402568817\n",
      "[proc 0][Train](85/100000) average regularization: 0.010927364230155945\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](84/100000) average pos_loss: 0.30118897557258606\n",
      "[proc 1][Train](84/100000) average neg_loss: 0.6706031560897827\n",
      "[proc 1][Train](84/100000) average loss: 0.4858960509300232\n",
      "[proc 1][Train](84/100000) average regularization: 0.011239948682487011\n",
      "[proc 1][Train] 1 steps take 1.400 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.272, backward: 0.003, update: 1.124\n",
      "[proc 0][Train](86/100000) average pos_loss: 0.2708647847175598\n",
      "[proc 0][Train](86/100000) average neg_loss: 0.7994887232780457\n",
      "[proc 0][Train](86/100000) average loss: 0.5351767539978027\n",
      "[proc 0][Train](86/100000) average regularization: 0.011922640725970268\n",
      "[proc 0][Train] 1 steps take 1.307 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.195, backward: 0.002, update: 1.108\n",
      "[proc 1][Train](85/100000) average pos_loss: 0.2790030241012573\n",
      "[proc 1][Train](85/100000) average neg_loss: 0.5201400518417358\n",
      "[proc 1][Train](85/100000) average loss: 0.3995715379714966\n",
      "[proc 1][Train](85/100000) average regularization: 0.01197664812207222\n",
      "[proc 1][Train] 1 steps take 1.273 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.066\n",
      "[proc 0][Train](87/100000) average pos_loss: 0.2972337603569031\n",
      "[proc 0][Train](87/100000) average neg_loss: 0.5255727767944336\n",
      "[proc 0][Train](87/100000) average loss: 0.41140326857566833\n",
      "[proc 0][Train](87/100000) average regularization: 0.011330883949995041\n",
      "[proc 0][Train] 1 steps take 1.293 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.194, backward: 0.003, update: 1.095\n",
      "[proc 1][Train](86/100000) average pos_loss: 0.29534873366355896\n",
      "[proc 1][Train](86/100000) average neg_loss: 0.7451474070549011\n",
      "[proc 1][Train](86/100000) average loss: 0.5202480554580688\n",
      "[proc 1][Train](86/100000) average regularization: 0.0115291066467762\n",
      "[proc 1][Train] 1 steps take 1.290 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.072\n",
      "[proc 0][Train](88/100000) average pos_loss: 0.2745036482810974\n",
      "[proc 0][Train](88/100000) average neg_loss: 0.7247521877288818\n",
      "[proc 0][Train](88/100000) average loss: 0.4996279180049896\n",
      "[proc 0][Train](88/100000) average regularization: 0.012027198448777199\n",
      "[proc 0][Train] 1 steps take 1.305 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.197, backward: 0.002, update: 1.105\n",
      "[proc 1][Train](87/100000) average pos_loss: 0.2973411977291107\n",
      "[proc 1][Train](87/100000) average neg_loss: 0.4945133924484253\n",
      "[proc 1][Train](87/100000) average loss: 0.3959273099899292\n",
      "[proc 1][Train](87/100000) average regularization: 0.011565285734832287\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.075\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](89/100000) average pos_loss: 0.31167203187942505\n",
      "[proc 0][Train](89/100000) average neg_loss: 0.46996036171913147\n",
      "[proc 0][Train](89/100000) average loss: 0.39081621170043945\n",
      "[proc 0][Train](89/100000) average regularization: 0.011424211785197258\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.099\n",
      "[proc 1][Train](88/100000) average pos_loss: 0.29164403676986694\n",
      "[proc 1][Train](88/100000) average neg_loss: 0.6933490633964539\n",
      "[proc 1][Train](88/100000) average loss: 0.4924965500831604\n",
      "[proc 1][Train](88/100000) average regularization: 0.01178573165088892\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](90/100000) average pos_loss: 0.25890952348709106\n",
      "[proc 0][Train](90/100000) average neg_loss: 0.8033853769302368\n",
      "[proc 0][Train](90/100000) average loss: 0.5311474800109863\n",
      "[proc 0][Train](90/100000) average regularization: 0.012841551564633846\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](89/100000) average pos_loss: 0.2781186103820801\n",
      "[proc 1][Train](89/100000) average neg_loss: 0.5033689737319946\n",
      "[proc 1][Train](89/100000) average loss: 0.39074379205703735\n",
      "[proc 1][Train](89/100000) average regularization: 0.012393327429890633\n",
      "[proc 1][Train] 1 steps take 1.288 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.073\n",
      "[proc 0][Train](91/100000) average pos_loss: 0.30992329120635986\n",
      "[proc 0][Train](91/100000) average neg_loss: 0.4678530693054199\n",
      "[proc 0][Train](91/100000) average loss: 0.3888881802558899\n",
      "[proc 0][Train](91/100000) average regularization: 0.011501245200634003\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](90/100000) average pos_loss: 0.29834264516830444\n",
      "[proc 1][Train](90/100000) average neg_loss: 0.711689829826355\n",
      "[proc 1][Train](90/100000) average loss: 0.5050162076950073\n",
      "[proc 1][Train](90/100000) average regularization: 0.01198515109717846\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](92/100000) average pos_loss: 0.274219274520874\n",
      "[proc 0][Train](92/100000) average neg_loss: 0.7965843081474304\n",
      "[proc 0][Train](92/100000) average loss: 0.5354018211364746\n",
      "[proc 0][Train](92/100000) average regularization: 0.012825852259993553\n",
      "[proc 0][Train] 1 steps take 1.350 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.131\n",
      "[proc 1][Train](91/100000) average pos_loss: 0.28658947348594666\n",
      "[proc 1][Train](91/100000) average neg_loss: 0.5399965047836304\n",
      "[proc 1][Train](91/100000) average loss: 0.4132930040359497\n",
      "[proc 1][Train](91/100000) average regularization: 0.012356560677289963\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](93/100000) average pos_loss: 0.31823715567588806\n",
      "[proc 0][Train](93/100000) average neg_loss: 0.5718573331832886\n",
      "[proc 0][Train](93/100000) average loss: 0.4450472593307495\n",
      "[proc 0][Train](93/100000) average regularization: 0.011667026206851006\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](92/100000) average pos_loss: 0.3063080906867981\n",
      "[proc 1][Train](92/100000) average neg_loss: 0.6464139223098755\n",
      "[proc 1][Train](92/100000) average loss: 0.4763610064983368\n",
      "[proc 1][Train](92/100000) average regularization: 0.011899074539542198\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.085\n",
      "[proc 0][Train](94/100000) average pos_loss: 0.3054842948913574\n",
      "[proc 0][Train](94/100000) average neg_loss: 0.6785709857940674\n",
      "[proc 0][Train](94/100000) average loss: 0.4920276403427124\n",
      "[proc 0][Train](94/100000) average regularization: 0.01206947397440672\n",
      "[proc 0][Train] 1 steps take 1.312 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.098\n",
      "[proc 1][Train](93/100000) average pos_loss: 0.2904987931251526\n",
      "[proc 1][Train](93/100000) average neg_loss: 0.4399801194667816\n",
      "[proc 1][Train](93/100000) average loss: 0.3652394413948059\n",
      "[proc 1][Train](93/100000) average regularization: 0.012390488758683205\n",
      "[proc 1][Train] 1 steps take 1.279 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.070\n",
      "[proc 0][Train](95/100000) average pos_loss: 0.29938244819641113\n",
      "[proc 0][Train](95/100000) average neg_loss: 0.46043679118156433\n",
      "[proc 0][Train](95/100000) average loss: 0.3799096345901489\n",
      "[proc 0][Train](95/100000) average regularization: 0.012180902063846588\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.090\n",
      "[proc 1][Train](94/100000) average pos_loss: 0.27760398387908936\n",
      "[proc 1][Train](94/100000) average neg_loss: 0.8023048639297485\n",
      "[proc 1][Train](94/100000) average loss: 0.539954423904419\n",
      "[proc 1][Train](94/100000) average regularization: 0.012657523155212402\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.085\n",
      "[proc 0][Train](96/100000) average pos_loss: 0.25973349809646606\n",
      "[proc 0][Train](96/100000) average neg_loss: 0.8179674744606018\n",
      "[proc 0][Train](96/100000) average loss: 0.5388504862785339\n",
      "[proc 0][Train](96/100000) average regularization: 0.013434968888759613\n",
      "[proc 0][Train] 1 steps take 1.305 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.090\n",
      "[proc 1][Train](95/100000) average pos_loss: 0.28785794973373413\n",
      "[proc 1][Train](95/100000) average neg_loss: 0.4790973663330078\n",
      "[proc 1][Train](95/100000) average loss: 0.38347765803337097\n",
      "[proc 1][Train](95/100000) average regularization: 0.012473014183342457\n",
      "[proc 1][Train] 1 steps take 1.305 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](97/100000) average pos_loss: 0.32868725061416626\n",
      "[proc 0][Train](97/100000) average neg_loss: 0.4287509322166443\n",
      "[proc 0][Train](97/100000) average loss: 0.3787190914154053\n",
      "[proc 0][Train](97/100000) average regularization: 0.011914161965250969\n",
      "[proc 0][Train] 1 steps take 1.331 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.211, backward: 0.003, update: 1.100\n",
      "[proc 1][Train](96/100000) average pos_loss: 0.3106125593185425\n",
      "[proc 1][Train](96/100000) average neg_loss: 0.6734475493431091\n",
      "[proc 1][Train](96/100000) average loss: 0.4920300543308258\n",
      "[proc 1][Train](96/100000) average regularization: 0.012161512859165668\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](98/100000) average pos_loss: 0.27859780192375183\n",
      "[proc 0][Train](98/100000) average neg_loss: 0.7590566873550415\n",
      "[proc 0][Train](98/100000) average loss: 0.5188272595405579\n",
      "[proc 0][Train](98/100000) average regularization: 0.012971464544534683\n",
      "[proc 0][Train] 1 steps take 1.319 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.214, backward: 0.003, update: 1.085\n",
      "[proc 1][Train](97/100000) average pos_loss: 0.27219057083129883\n",
      "[proc 1][Train](97/100000) average neg_loss: 0.5420196652412415\n",
      "[proc 1][Train](97/100000) average loss: 0.40710511803627014\n",
      "[proc 1][Train](97/100000) average regularization: 0.013102407567203045\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.201, backward: 0.003, update: 1.072\n",
      "[proc 0][Train](99/100000) average pos_loss: 0.30178698897361755\n",
      "[proc 0][Train](99/100000) average neg_loss: 0.49622318148612976\n",
      "[proc 0][Train](99/100000) average loss: 0.39900508522987366\n",
      "[proc 0][Train](99/100000) average regularization: 0.012584910728037357\n",
      "[proc 0][Train] 1 steps take 1.253 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.044\n",
      "[proc 1][Train](98/100000) average pos_loss: 0.30482980608940125\n",
      "[proc 1][Train](98/100000) average neg_loss: 0.6331691741943359\n",
      "[proc 1][Train](98/100000) average loss: 0.4689995050430298\n",
      "[proc 1][Train](98/100000) average regularization: 0.012719093821942806\n",
      "[proc 1][Train] 1 steps take 1.330 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.217, backward: 0.003, update: 1.094\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](100/100000) average pos_loss: 0.28510886430740356\n",
      "[proc 0][Train](100/100000) average neg_loss: 0.7361695170402527\n",
      "[proc 0][Train](100/100000) average loss: 0.5106391906738281\n",
      "[proc 0][Train](100/100000) average regularization: 0.01308511383831501\n",
      "[proc 0][Train] 1 steps take 1.278 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.071\n",
      "[proc 1][Train](99/100000) average pos_loss: 0.2938253581523895\n",
      "[proc 1][Train](99/100000) average neg_loss: 0.470475971698761\n",
      "[proc 1][Train](99/100000) average loss: 0.38215065002441406\n",
      "[proc 1][Train](99/100000) average regularization: 0.012825747951865196\n",
      "[proc 1][Train] 1 steps take 1.284 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.002, update: 1.067\n",
      "[proc 0][Train](101/100000) average pos_loss: 0.30746281147003174\n",
      "[proc 0][Train](101/100000) average neg_loss: 0.48244673013687134\n",
      "[proc 0][Train](101/100000) average loss: 0.39495477080345154\n",
      "[proc 0][Train](101/100000) average regularization: 0.012654775753617287\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.103\n",
      "[proc 1][Train](100/100000) average pos_loss: 0.287445604801178\n",
      "[proc 1][Train](100/100000) average neg_loss: 0.7240116596221924\n",
      "[proc 1][Train](100/100000) average loss: 0.5057286024093628\n",
      "[proc 1][Train](100/100000) average regularization: 0.012942326255142689\n",
      "[proc 1][Train] 1 steps take 1.276 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.204, backward: 0.002, update: 1.069\n",
      "[proc 0][Train](102/100000) average pos_loss: 0.2728409767150879\n",
      "[proc 0][Train](102/100000) average neg_loss: 0.7861981391906738\n",
      "[proc 0][Train](102/100000) average loss: 0.5295195579528809\n",
      "[proc 0][Train](102/100000) average regularization: 0.013653488829731941\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.074\n",
      "[proc 1][Train](101/100000) average pos_loss: 0.2995637357234955\n",
      "[proc 1][Train](101/100000) average neg_loss: 0.5043795704841614\n",
      "[proc 1][Train](101/100000) average loss: 0.40197163820266724\n",
      "[proc 1][Train](101/100000) average regularization: 0.01305977813899517\n",
      "[proc 1][Train] 1 steps take 1.283 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.195, backward: 0.002, update: 1.085\n",
      "[proc 0][Train](103/100000) average pos_loss: 0.32790619134902954\n",
      "[proc 0][Train](103/100000) average neg_loss: 0.41888898611068726\n",
      "[proc 0][Train](103/100000) average loss: 0.3733975887298584\n",
      "[proc 0][Train](103/100000) average regularization: 0.012567105703055859\n",
      "[proc 0][Train] 1 steps take 1.279 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.065\n",
      "[proc 1][Train](102/100000) average pos_loss: 0.3115454912185669\n",
      "[proc 1][Train](102/100000) average neg_loss: 0.6290592551231384\n",
      "[proc 1][Train](102/100000) average loss: 0.47030237317085266\n",
      "[proc 1][Train](102/100000) average regularization: 0.012683695182204247\n",
      "[proc 1][Train] 1 steps take 1.280 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.075\n",
      "[proc 0][Train](104/100000) average pos_loss: 0.2778163552284241\n",
      "[proc 0][Train](104/100000) average neg_loss: 0.7371969819068909\n",
      "[proc 0][Train](104/100000) average loss: 0.5075066685676575\n",
      "[proc 0][Train](104/100000) average regularization: 0.01326009351760149\n",
      "[proc 0][Train] 1 steps take 1.322 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.105\n",
      "[proc 1][Train](103/100000) average pos_loss: 0.2840539813041687\n",
      "[proc 1][Train](103/100000) average neg_loss: 0.5592150092124939\n",
      "[proc 1][Train](103/100000) average loss: 0.4216344952583313\n",
      "[proc 1][Train](103/100000) average regularization: 0.013515239581465721\n",
      "[proc 1][Train] 1 steps take 1.313 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.095\n",
      "[proc 0][Train](105/100000) average pos_loss: 0.30063727498054504\n",
      "[proc 0][Train](105/100000) average neg_loss: 0.49456673860549927\n",
      "[proc 0][Train](105/100000) average loss: 0.39760202169418335\n",
      "[proc 0][Train](105/100000) average regularization: 0.012986714951694012\n",
      "[proc 0][Train] 1 steps take 1.357 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.139\n",
      "[proc 1][Train](104/100000) average pos_loss: 0.29732614755630493\n",
      "[proc 1][Train](104/100000) average neg_loss: 0.7198700904846191\n",
      "[proc 1][Train](104/100000) average loss: 0.5085980892181396\n",
      "[proc 1][Train](104/100000) average regularization: 0.013064494356513023\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.002, update: 1.093\n",
      "[proc 0][Train](106/100000) average pos_loss: 0.2912089228630066\n",
      "[proc 0][Train](106/100000) average neg_loss: 0.6775088906288147\n",
      "[proc 0][Train](106/100000) average loss: 0.48435890674591064\n",
      "[proc 0][Train](106/100000) average regularization: 0.013323298655450344\n",
      "[proc 0][Train] 1 steps take 1.268 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.234, backward: 0.004, update: 1.029\n",
      "[proc 1][Train](105/100000) average pos_loss: 0.316490113735199\n",
      "[proc 1][Train](105/100000) average neg_loss: 0.47090384364128113\n",
      "[proc 1][Train](105/100000) average loss: 0.39369696378707886\n",
      "[proc 1][Train](105/100000) average regularization: 0.012842511758208275\n",
      "[proc 1][Train] 1 steps take 1.285 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.195, backward: 0.002, update: 1.087\n",
      "[proc 0][Train](107/100000) average pos_loss: 0.3187071979045868\n",
      "[proc 0][Train](107/100000) average neg_loss: 0.45090585947036743\n",
      "[proc 0][Train](107/100000) average loss: 0.3848065137863159\n",
      "[proc 0][Train](107/100000) average regularization: 0.012933784164488316\n",
      "[proc 0][Train] 1 steps take 1.377 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.236, backward: 0.004, update: 1.136\n",
      "[proc 1][Train](106/100000) average pos_loss: 0.2887240946292877\n",
      "[proc 1][Train](106/100000) average neg_loss: 0.7214882969856262\n",
      "[proc 1][Train](106/100000) average loss: 0.5051062107086182\n",
      "[proc 1][Train](106/100000) average regularization: 0.013459842652082443\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.196, backward: 0.002, update: 1.096\n",
      "[proc 0][Train](108/100000) average pos_loss: 0.2730702757835388\n",
      "[proc 0][Train](108/100000) average neg_loss: 0.7610831260681152\n",
      "[proc 0][Train](108/100000) average loss: 0.5170767307281494\n",
      "[proc 0][Train](108/100000) average regularization: 0.013570098206400871\n",
      "[proc 0][Train] 1 steps take 1.404 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.245, backward: 0.005, update: 1.153\n",
      "[proc 1][Train](107/100000) average pos_loss: 0.3004494607448578\n",
      "[proc 1][Train](107/100000) average neg_loss: 0.4421941637992859\n",
      "[proc 1][Train](107/100000) average loss: 0.37132179737091064\n",
      "[proc 1][Train](107/100000) average regularization: 0.013158095069229603\n",
      "[proc 1][Train] 1 steps take 1.328 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.110\n",
      "[proc 0][Train](109/100000) average pos_loss: 0.3255455791950226\n",
      "[proc 0][Train](109/100000) average neg_loss: 0.4043208062648773\n",
      "[proc 0][Train](109/100000) average loss: 0.36493319272994995\n",
      "[proc 0][Train](109/100000) average regularization: 0.012924935668706894\n",
      "[proc 0][Train] 1 steps take 1.286 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.222, backward: 0.004, update: 1.059\n",
      "[proc 1][Train](108/100000) average pos_loss: 0.30981743335723877\n",
      "[proc 1][Train](108/100000) average neg_loss: 0.6654802560806274\n",
      "[proc 1][Train](108/100000) average loss: 0.4876488447189331\n",
      "[proc 1][Train](108/100000) average regularization: 0.013210048899054527\n",
      "[proc 1][Train] 1 steps take 1.364 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.234, backward: 0.003, update: 1.127\n",
      "[proc 0][Train](110/100000) average pos_loss: 0.2710758447647095\n",
      "[proc 0][Train](110/100000) average neg_loss: 0.8230068683624268\n",
      "[proc 0][Train](110/100000) average loss: 0.5470413565635681\n",
      "[proc 0][Train](110/100000) average regularization: 0.014052411541342735\n",
      "[proc 0][Train] 1 steps take 1.268 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.004, update: 1.051\n",
      "[proc 1][Train](109/100000) average pos_loss: 0.2827845811843872\n",
      "[proc 1][Train](109/100000) average neg_loss: 0.5299883484840393\n",
      "[proc 1][Train](109/100000) average loss: 0.40638646483421326\n",
      "[proc 1][Train](109/100000) average regularization: 0.013692421838641167\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.107\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](111/100000) average pos_loss: 0.3280925750732422\n",
      "[proc 0][Train](111/100000) average neg_loss: 0.4951988756656647\n",
      "[proc 0][Train](111/100000) average loss: 0.41164571046829224\n",
      "[proc 0][Train](111/100000) average regularization: 0.012888947501778603\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.004, update: 1.084\n",
      "[proc 1][Train](110/100000) average pos_loss: 0.32112133502960205\n",
      "[proc 1][Train](110/100000) average neg_loss: 0.6370636820793152\n",
      "[proc 1][Train](110/100000) average loss: 0.4790925085544586\n",
      "[proc 1][Train](110/100000) average regularization: 0.013006756082177162\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.101\n",
      "[proc 0][Train](112/100000) average pos_loss: 0.30913928151130676\n",
      "[proc 0][Train](112/100000) average neg_loss: 0.6508605480194092\n",
      "[proc 0][Train](112/100000) average loss: 0.4799998998641968\n",
      "[proc 0][Train](112/100000) average regularization: 0.013266809284687042\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.004, update: 1.077\n",
      "[proc 1][Train](111/100000) average pos_loss: 0.2938457429409027\n",
      "[proc 1][Train](111/100000) average neg_loss: 0.4636741280555725\n",
      "[proc 1][Train](111/100000) average loss: 0.3787599205970764\n",
      "[proc 1][Train](111/100000) average regularization: 0.013299331068992615\n",
      "[proc 1][Train] 1 steps take 1.337 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.220, backward: 0.003, update: 1.113\n",
      "[proc 0][Train](113/100000) average pos_loss: 0.29701897501945496\n",
      "[proc 0][Train](113/100000) average neg_loss: 0.5058068633079529\n",
      "[proc 0][Train](113/100000) average loss: 0.4014129042625427\n",
      "[proc 0][Train](113/100000) average regularization: 0.01358600053936243\n",
      "[proc 0][Train] 1 steps take 1.295 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.212, backward: 0.004, update: 1.059\n",
      "[proc 1][Train](112/100000) average pos_loss: 0.2838342785835266\n",
      "[proc 1][Train](112/100000) average neg_loss: 0.7298460006713867\n",
      "[proc 1][Train](112/100000) average loss: 0.5068401098251343\n",
      "[proc 1][Train](112/100000) average regularization: 0.013456075452268124\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](114/100000) average pos_loss: 0.28168532252311707\n",
      "[proc 0][Train](114/100000) average neg_loss: 0.7141242027282715\n",
      "[proc 0][Train](114/100000) average loss: 0.49790477752685547\n",
      "[proc 0][Train](114/100000) average regularization: 0.013746064156293869\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.211, backward: 0.004, update: 1.099\n",
      "[proc 1][Train](113/100000) average pos_loss: 0.310232937335968\n",
      "[proc 1][Train](113/100000) average neg_loss: 0.42700523138046265\n",
      "[proc 1][Train](113/100000) average loss: 0.36861908435821533\n",
      "[proc 1][Train](113/100000) average regularization: 0.013506811112165451\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.195, backward: 0.003, update: 1.077\n",
      "[proc 0][Train](115/100000) average pos_loss: 0.3357907831668854\n",
      "[proc 0][Train](115/100000) average neg_loss: 0.44054293632507324\n",
      "[proc 0][Train](115/100000) average loss: 0.3881668448448181\n",
      "[proc 0][Train](115/100000) average regularization: 0.012969641014933586\n",
      "[proc 0][Train] 1 steps take 1.276 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.056\n",
      "[proc 1][Train](114/100000) average pos_loss: 0.2992003262042999\n",
      "[proc 1][Train](114/100000) average neg_loss: 0.6432965993881226\n",
      "[proc 1][Train](114/100000) average loss: 0.47124844789505005\n",
      "[proc 1][Train](114/100000) average regularization: 0.013422592543065548\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.211, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](116/100000) average pos_loss: 0.2770790457725525\n",
      "[proc 0][Train](116/100000) average neg_loss: 0.7595157027244568\n",
      "[proc 0][Train](116/100000) average loss: 0.5182973742485046\n",
      "[proc 0][Train](116/100000) average regularization: 0.013945934362709522\n",
      "[proc 0][Train] 1 steps take 1.281 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.064\n",
      "[proc 1][Train](115/100000) average pos_loss: 0.2844015955924988\n",
      "[proc 1][Train](115/100000) average neg_loss: 0.49376535415649414\n",
      "[proc 1][Train](115/100000) average loss: 0.38908347487449646\n",
      "[proc 1][Train](115/100000) average regularization: 0.013772621750831604\n",
      "[proc 1][Train] 1 steps take 1.275 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.069\n",
      "[proc 0][Train](117/100000) average pos_loss: 0.3154774606227875\n",
      "[proc 0][Train](117/100000) average neg_loss: 0.4317139983177185\n",
      "[proc 0][Train](117/100000) average loss: 0.3735957145690918\n",
      "[proc 0][Train](117/100000) average regularization: 0.013223222456872463\n",
      "[proc 0][Train] 1 steps take 1.319 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.109\n",
      "[proc 1][Train](116/100000) average pos_loss: 0.3082631826400757\n",
      "[proc 1][Train](116/100000) average neg_loss: 0.6892222166061401\n",
      "[proc 1][Train](116/100000) average loss: 0.4987426996231079\n",
      "[proc 1][Train](116/100000) average regularization: 0.013432393781840801\n",
      "[proc 1][Train] 1 steps take 1.308 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](118/100000) average pos_loss: 0.28771737217903137\n",
      "[proc 0][Train](118/100000) average neg_loss: 0.7241665124893188\n",
      "[proc 0][Train](118/100000) average loss: 0.5059419274330139\n",
      "[proc 0][Train](118/100000) average regularization: 0.013860129751265049\n",
      "[proc 0][Train] 1 steps take 1.359 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.269, backward: 0.003, update: 1.085\n",
      "[proc 1][Train](117/100000) average pos_loss: 0.294755220413208\n",
      "[proc 1][Train](117/100000) average neg_loss: 0.46239280700683594\n",
      "[proc 1][Train](117/100000) average loss: 0.378574013710022\n",
      "[proc 1][Train](117/100000) average regularization: 0.013823647052049637\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.080\n",
      "[proc 0][Train](119/100000) average pos_loss: 0.31663790345191956\n",
      "[proc 0][Train](119/100000) average neg_loss: 0.4067513346672058\n",
      "[proc 0][Train](119/100000) average loss: 0.3616946339607239\n",
      "[proc 0][Train](119/100000) average regularization: 0.01351951900869608\n",
      "[proc 0][Train] 1 steps take 1.338 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.122\n",
      "[proc 1][Train](118/100000) average pos_loss: 0.30401602387428284\n",
      "[proc 1][Train](118/100000) average neg_loss: 0.623037576675415\n",
      "[proc 1][Train](118/100000) average loss: 0.46352678537368774\n",
      "[proc 1][Train](118/100000) average regularization: 0.013712998479604721\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.108\n",
      "[proc 0][Train](120/100000) average pos_loss: 0.27793577313423157\n",
      "[proc 0][Train](120/100000) average neg_loss: 0.7420604228973389\n",
      "[proc 0][Train](120/100000) average loss: 0.509998083114624\n",
      "[proc 0][Train](120/100000) average regularization: 0.01423928327858448\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.004, update: 1.082\n",
      "[proc 1][Train](119/100000) average pos_loss: 0.28241050243377686\n",
      "[proc 1][Train](119/100000) average neg_loss: 0.44563737511634827\n",
      "[proc 1][Train](119/100000) average loss: 0.36402392387390137\n",
      "[proc 1][Train](119/100000) average regularization: 0.01420245785266161\n",
      "[proc 1][Train] 1 steps take 1.323 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.218, backward: 0.003, update: 1.101\n",
      "[proc 0][Train](121/100000) average pos_loss: 0.3246835470199585\n",
      "[proc 0][Train](121/100000) average neg_loss: 0.3994879424571991\n",
      "[proc 0][Train](121/100000) average loss: 0.36208575963974\n",
      "[proc 0][Train](121/100000) average regularization: 0.013719207607209682\n",
      "[proc 0][Train] 1 steps take 1.317 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.098\n",
      "[proc 1][Train](120/100000) average pos_loss: 0.3120076060295105\n",
      "[proc 1][Train](120/100000) average neg_loss: 0.6778647899627686\n",
      "[proc 1][Train](120/100000) average loss: 0.4949361979961395\n",
      "[proc 1][Train](120/100000) average regularization: 0.014031939208507538\n",
      "[proc 1][Train] 1 steps take 1.321 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.101\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](122/100000) average pos_loss: 0.2778048515319824\n",
      "[proc 0][Train](122/100000) average neg_loss: 0.7613751888275146\n",
      "[proc 0][Train](122/100000) average loss: 0.5195900201797485\n",
      "[proc 0][Train](122/100000) average regularization: 0.014629228040575981\n",
      "[proc 0][Train] 1 steps take 1.261 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.002, update: 1.048\n",
      "[proc 1][Train](121/100000) average pos_loss: 0.2931033968925476\n",
      "[proc 1][Train](121/100000) average neg_loss: 0.4345565438270569\n",
      "[proc 1][Train](121/100000) average loss: 0.36382997035980225\n",
      "[proc 1][Train](121/100000) average regularization: 0.014209708198904991\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.198, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](123/100000) average pos_loss: 0.3202698528766632\n",
      "[proc 0][Train](123/100000) average neg_loss: 0.4335372745990753\n",
      "[proc 0][Train](123/100000) average loss: 0.37690356373786926\n",
      "[proc 0][Train](123/100000) average regularization: 0.01362618152052164\n",
      "[proc 0][Train] 1 steps take 1.307 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.092\n",
      "[proc 1][Train](122/100000) average pos_loss: 0.315569669008255\n",
      "[proc 1][Train](122/100000) average neg_loss: 0.6376299858093262\n",
      "[proc 1][Train](122/100000) average loss: 0.4765998125076294\n",
      "[proc 1][Train](122/100000) average regularization: 0.013974979519844055\n",
      "[proc 1][Train] 1 steps take 1.378 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.223, backward: 0.003, update: 1.151\n",
      "[proc 0][Train](124/100000) average pos_loss: 0.29815763235092163\n",
      "[proc 0][Train](124/100000) average neg_loss: 0.7677171230316162\n",
      "[proc 0][Train](124/100000) average loss: 0.5329374074935913\n",
      "[proc 0][Train](124/100000) average regularization: 0.014451605267822742\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.089\n",
      "[proc 1][Train](123/100000) average pos_loss: 0.29985231161117554\n",
      "[proc 1][Train](123/100000) average neg_loss: 0.4784112572669983\n",
      "[proc 1][Train](123/100000) average loss: 0.3891317844390869\n",
      "[proc 1][Train](123/100000) average regularization: 0.014266361482441425\n",
      "[proc 1][Train] 1 steps take 1.279 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.062\n",
      "[proc 0][Train](125/100000) average pos_loss: 0.3244551420211792\n",
      "[proc 0][Train](125/100000) average neg_loss: 0.43253132700920105\n",
      "[proc 0][Train](125/100000) average loss: 0.3784932494163513\n",
      "[proc 0][Train](125/100000) average regularization: 0.01382469106465578\n",
      "[proc 0][Train] 1 steps take 1.275 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.071\n",
      "[proc 1][Train](124/100000) average pos_loss: 0.32099056243896484\n",
      "[proc 1][Train](124/100000) average neg_loss: 0.6314866542816162\n",
      "[proc 1][Train](124/100000) average loss: 0.4762386083602905\n",
      "[proc 1][Train](124/100000) average regularization: 0.013775323517620564\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](126/100000) average pos_loss: 0.2896122932434082\n",
      "[proc 0][Train](126/100000) average neg_loss: 0.7194036245346069\n",
      "[proc 0][Train](126/100000) average loss: 0.5045079588890076\n",
      "[proc 0][Train](126/100000) average regularization: 0.014279632829129696\n",
      "[proc 0][Train] 1 steps take 1.419 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.302, backward: 0.002, update: 1.113\n",
      "[proc 1][Train](125/100000) average pos_loss: 0.2944699227809906\n",
      "[proc 1][Train](125/100000) average neg_loss: 0.4396964907646179\n",
      "[proc 1][Train](125/100000) average loss: 0.36708319187164307\n",
      "[proc 1][Train](125/100000) average regularization: 0.01430260669440031\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.097\n",
      "[proc 0][Train](127/100000) average pos_loss: 0.3117239475250244\n",
      "[proc 0][Train](127/100000) average neg_loss: 0.42120030522346497\n",
      "[proc 0][Train](127/100000) average loss: 0.3664621114730835\n",
      "[proc 0][Train](127/100000) average regularization: 0.014069235883653164\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.090\n",
      "[proc 1][Train](126/100000) average pos_loss: 0.3059117794036865\n",
      "[proc 1][Train](126/100000) average neg_loss: 0.636533260345459\n",
      "[proc 1][Train](126/100000) average loss: 0.47122251987457275\n",
      "[proc 1][Train](126/100000) average regularization: 0.014167655259370804\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](128/100000) average pos_loss: 0.2839607000350952\n",
      "[proc 0][Train](128/100000) average neg_loss: 0.7141050100326538\n",
      "[proc 0][Train](128/100000) average loss: 0.4990328550338745\n",
      "[proc 0][Train](128/100000) average regularization: 0.01471740659326315\n",
      "[proc 0][Train] 1 steps take 1.282 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.004, update: 1.067\n",
      "[proc 1][Train](127/100000) average pos_loss: 0.2951969802379608\n",
      "[proc 1][Train](127/100000) average neg_loss: 0.4611737132072449\n",
      "[proc 1][Train](127/100000) average loss: 0.37818533182144165\n",
      "[proc 1][Train](127/100000) average regularization: 0.014417783357203007\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](129/100000) average pos_loss: 0.3253048062324524\n",
      "[proc 0][Train](129/100000) average neg_loss: 0.45873403549194336\n",
      "[proc 0][Train](129/100000) average loss: 0.3920194208621979\n",
      "[proc 0][Train](129/100000) average regularization: 0.013962414115667343\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.209, backward: 0.003, update: 1.083\n",
      "[proc 1][Train](128/100000) average pos_loss: 0.3099604845046997\n",
      "[proc 1][Train](128/100000) average neg_loss: 0.6598275899887085\n",
      "[proc 1][Train](128/100000) average loss: 0.4848940372467041\n",
      "[proc 1][Train](128/100000) average regularization: 0.014422507956624031\n",
      "[proc 1][Train] 1 steps take 1.323 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.104\n",
      "[proc 0][Train](130/100000) average pos_loss: 0.2913767993450165\n",
      "[proc 0][Train](130/100000) average neg_loss: 0.7124511003494263\n",
      "[proc 0][Train](130/100000) average loss: 0.5019139647483826\n",
      "[proc 0][Train](130/100000) average regularization: 0.014429687522351742\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.210, backward: 0.003, update: 1.100\n",
      "[proc 1][Train](129/100000) average pos_loss: 0.3026978373527527\n",
      "[proc 1][Train](129/100000) average neg_loss: 0.3932100534439087\n",
      "[proc 1][Train](129/100000) average loss: 0.3479539453983307\n",
      "[proc 1][Train](129/100000) average regularization: 0.014407623559236526\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.219, backward: 0.003, update: 1.086\n",
      "[proc 0][Train](131/100000) average pos_loss: 0.3248441219329834\n",
      "[proc 0][Train](131/100000) average neg_loss: 0.4219927489757538\n",
      "[proc 0][Train](131/100000) average loss: 0.3734184503555298\n",
      "[proc 0][Train](131/100000) average regularization: 0.014115191996097565\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.085\n",
      "[proc 1][Train](130/100000) average pos_loss: 0.30161017179489136\n",
      "[proc 1][Train](130/100000) average neg_loss: 0.6624218225479126\n",
      "[proc 1][Train](130/100000) average loss: 0.482015997171402\n",
      "[proc 1][Train](130/100000) average regularization: 0.014443536289036274\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.214, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](132/100000) average pos_loss: 0.27777037024497986\n",
      "[proc 0][Train](132/100000) average neg_loss: 0.6997100114822388\n",
      "[proc 0][Train](132/100000) average loss: 0.4887402057647705\n",
      "[proc 0][Train](132/100000) average regularization: 0.014827617444097996\n",
      "[proc 0][Train] 1 steps take 1.292 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](131/100000) average pos_loss: 0.2995782792568207\n",
      "[proc 1][Train](131/100000) average neg_loss: 0.43203893303871155\n",
      "[proc 1][Train](131/100000) average loss: 0.3658086061477661\n",
      "[proc 1][Train](131/100000) average regularization: 0.01452399231493473\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.091\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](133/100000) average pos_loss: 0.31979313492774963\n",
      "[proc 0][Train](133/100000) average neg_loss: 0.39531630277633667\n",
      "[proc 0][Train](133/100000) average loss: 0.35755473375320435\n",
      "[proc 0][Train](133/100000) average regularization: 0.014386249706149101\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.080\n",
      "[proc 1][Train](132/100000) average pos_loss: 0.3139960467815399\n",
      "[proc 1][Train](132/100000) average neg_loss: 0.6749774217605591\n",
      "[proc 1][Train](132/100000) average loss: 0.4944867491722107\n",
      "[proc 1][Train](132/100000) average regularization: 0.014227389357984066\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.097\n",
      "[proc 0][Train](134/100000) average pos_loss: 0.2913999557495117\n",
      "[proc 0][Train](134/100000) average neg_loss: 0.762364387512207\n",
      "[proc 0][Train](134/100000) average loss: 0.5268821716308594\n",
      "[proc 0][Train](134/100000) average regularization: 0.014941615983843803\n",
      "[proc 0][Train] 1 steps take 1.361 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.144\n",
      "[proc 1][Train](133/100000) average pos_loss: 0.2981923520565033\n",
      "[proc 1][Train](133/100000) average neg_loss: 0.42452141642570496\n",
      "[proc 1][Train](133/100000) average loss: 0.3613568842411041\n",
      "[proc 1][Train](133/100000) average regularization: 0.01478723343461752\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.215, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](135/100000) average pos_loss: 0.3378360867500305\n",
      "[proc 0][Train](135/100000) average neg_loss: 0.4037385582923889\n",
      "[proc 0][Train](135/100000) average loss: 0.3707873225212097\n",
      "[proc 0][Train](135/100000) average regularization: 0.014175264164805412\n",
      "[proc 0][Train] 1 steps take 1.305 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.004, update: 1.084\n",
      "[proc 1][Train](134/100000) average pos_loss: 0.3149757385253906\n",
      "[proc 1][Train](134/100000) average neg_loss: 0.6242772936820984\n",
      "[proc 1][Train](134/100000) average loss: 0.4696265161037445\n",
      "[proc 1][Train](134/100000) average regularization: 0.0145803801715374\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.068\n",
      "[proc 0][Train](136/100000) average pos_loss: 0.2955392599105835\n",
      "[proc 0][Train](136/100000) average neg_loss: 0.7237178087234497\n",
      "[proc 0][Train](136/100000) average loss: 0.5096285343170166\n",
      "[proc 0][Train](136/100000) average regularization: 0.01480141095817089\n",
      "[proc 0][Train] 1 steps take 1.307 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.089\n",
      "[proc 1][Train](135/100000) average pos_loss: 0.29330188035964966\n",
      "[proc 1][Train](135/100000) average neg_loss: 0.4166192412376404\n",
      "[proc 1][Train](135/100000) average loss: 0.354960560798645\n",
      "[proc 1][Train](135/100000) average regularization: 0.014913002029061317\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](137/100000) average pos_loss: 0.3130759596824646\n",
      "[proc 0][Train](137/100000) average neg_loss: 0.42303138971328735\n",
      "[proc 0][Train](137/100000) average loss: 0.368053674697876\n",
      "[proc 0][Train](137/100000) average regularization: 0.014527353458106518\n",
      "[proc 0][Train] 1 steps take 1.317 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.098\n",
      "[proc 1][Train](136/100000) average pos_loss: 0.2954282760620117\n",
      "[proc 1][Train](136/100000) average neg_loss: 0.695931613445282\n",
      "[proc 1][Train](136/100000) average loss: 0.49567994475364685\n",
      "[proc 1][Train](136/100000) average regularization: 0.014748563058674335\n",
      "[proc 1][Train] 1 steps take 1.290 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.075\n",
      "[proc 0][Train](138/100000) average pos_loss: 0.2902339994907379\n",
      "[proc 0][Train](138/100000) average neg_loss: 0.8018867373466492\n",
      "[proc 0][Train](138/100000) average loss: 0.5460603833198547\n",
      "[proc 0][Train](138/100000) average regularization: 0.014880876988172531\n",
      "[proc 0][Train] 1 steps take 1.293 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.075\n",
      "[proc 1][Train](137/100000) average pos_loss: 0.3128897249698639\n",
      "[proc 1][Train](137/100000) average neg_loss: 0.40994390845298767\n",
      "[proc 1][Train](137/100000) average loss: 0.3614168167114258\n",
      "[proc 1][Train](137/100000) average regularization: 0.014495779760181904\n",
      "[proc 1][Train] 1 steps take 1.320 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.101\n",
      "[proc 0][Train](139/100000) average pos_loss: 0.3429012596607208\n",
      "[proc 0][Train](139/100000) average neg_loss: 0.4067654311656952\n",
      "[proc 0][Train](139/100000) average loss: 0.374833345413208\n",
      "[proc 0][Train](139/100000) average regularization: 0.014040864072740078\n",
      "[proc 0][Train] 1 steps take 1.257 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.048\n",
      "[proc 1][Train](138/100000) average pos_loss: 0.3183823227882385\n",
      "[proc 1][Train](138/100000) average neg_loss: 0.6964668035507202\n",
      "[proc 1][Train](138/100000) average loss: 0.5074245929718018\n",
      "[proc 1][Train](138/100000) average regularization: 0.014415334910154343\n",
      "[proc 1][Train] 1 steps take 1.345 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.223, backward: 0.003, update: 1.117\n",
      "[proc 0][Train](140/100000) average pos_loss: 0.28456804156303406\n",
      "[proc 0][Train](140/100000) average neg_loss: 0.717840313911438\n",
      "[proc 0][Train](140/100000) average loss: 0.5012041926383972\n",
      "[proc 0][Train](140/100000) average regularization: 0.014926915988326073\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.091\n",
      "[proc 1][Train](139/100000) average pos_loss: 0.30829745531082153\n",
      "[proc 1][Train](139/100000) average neg_loss: 0.39475464820861816\n",
      "[proc 1][Train](139/100000) average loss: 0.35152605175971985\n",
      "[proc 1][Train](139/100000) average regularization: 0.014530127868056297\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.074\n",
      "[proc 0][Train](141/100000) average pos_loss: 0.3338148593902588\n",
      "[proc 0][Train](141/100000) average neg_loss: 0.410054475069046\n",
      "[proc 0][Train](141/100000) average loss: 0.3719346523284912\n",
      "[proc 0][Train](141/100000) average regularization: 0.014225402846932411\n",
      "[proc 0][Train] 1 steps take 1.316 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.099\n",
      "[proc 1][Train](140/100000) average pos_loss: 0.3181423842906952\n",
      "[proc 1][Train](140/100000) average neg_loss: 0.671011745929718\n",
      "[proc 1][Train](140/100000) average loss: 0.4945770502090454\n",
      "[proc 1][Train](140/100000) average regularization: 0.014517668634653091\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](142/100000) average pos_loss: 0.29331305623054504\n",
      "[proc 0][Train](142/100000) average neg_loss: 0.7000111937522888\n",
      "[proc 0][Train](142/100000) average loss: 0.4966621398925781\n",
      "[proc 0][Train](142/100000) average regularization: 0.014865139499306679\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.102\n",
      "[proc 1][Train](141/100000) average pos_loss: 0.30312275886535645\n",
      "[proc 1][Train](141/100000) average neg_loss: 0.41760268807411194\n",
      "[proc 1][Train](141/100000) average loss: 0.360362708568573\n",
      "[proc 1][Train](141/100000) average regularization: 0.014698097482323647\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](143/100000) average pos_loss: 0.3292326331138611\n",
      "[proc 0][Train](143/100000) average neg_loss: 0.3846668303012848\n",
      "[proc 0][Train](143/100000) average loss: 0.35694974660873413\n",
      "[proc 0][Train](143/100000) average regularization: 0.014436569064855576\n",
      "[proc 0][Train] 1 steps take 1.343 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.239, backward: 0.003, update: 1.100\n",
      "[proc 1][Train](142/100000) average pos_loss: 0.30629977583885193\n",
      "[proc 1][Train](142/100000) average neg_loss: 0.6908671855926514\n",
      "[proc 1][Train](142/100000) average loss: 0.49858349561691284\n",
      "[proc 1][Train](142/100000) average regularization: 0.014600067399442196\n",
      "[proc 1][Train] 1 steps take 1.313 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.220, backward: 0.003, update: 1.089\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](144/100000) average pos_loss: 0.28493452072143555\n",
      "[proc 0][Train](144/100000) average neg_loss: 0.6906355023384094\n",
      "[proc 0][Train](144/100000) average loss: 0.4877850115299225\n",
      "[proc 0][Train](144/100000) average regularization: 0.014978306367993355\n",
      "[proc 0][Train] 1 steps take 1.310 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.092\n",
      "[proc 1][Train](143/100000) average pos_loss: 0.308117151260376\n",
      "[proc 1][Train](143/100000) average neg_loss: 0.3902253806591034\n",
      "[proc 1][Train](143/100000) average loss: 0.3491712808609009\n",
      "[proc 1][Train](143/100000) average regularization: 0.014567329548299313\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](145/100000) average pos_loss: 0.32951682806015015\n",
      "[proc 0][Train](145/100000) average neg_loss: 0.3861730992794037\n",
      "[proc 0][Train](145/100000) average loss: 0.3578449487686157\n",
      "[proc 0][Train](145/100000) average regularization: 0.014352090656757355\n",
      "[proc 0][Train] 1 steps take 1.371 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.208, backward: 0.004, update: 1.141\n",
      "[proc 1][Train](144/100000) average pos_loss: 0.3030094802379608\n",
      "[proc 1][Train](144/100000) average neg_loss: 0.6891292333602905\n",
      "[proc 1][Train](144/100000) average loss: 0.49606937170028687\n",
      "[proc 1][Train](144/100000) average regularization: 0.014761298894882202\n",
      "[proc 1][Train] 1 steps take 1.330 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.113\n",
      "[proc 0][Train](146/100000) average pos_loss: 0.2850605845451355\n",
      "[proc 0][Train](146/100000) average neg_loss: 0.7698847055435181\n",
      "[proc 0][Train](146/100000) average loss: 0.5274726152420044\n",
      "[proc 0][Train](146/100000) average regularization: 0.015117663890123367\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.215, backward: 0.004, update: 1.094\n",
      "[proc 1][Train](145/100000) average pos_loss: 0.2962368130683899\n",
      "[proc 1][Train](145/100000) average neg_loss: 0.40117406845092773\n",
      "[proc 1][Train](145/100000) average loss: 0.3487054407596588\n",
      "[proc 1][Train](145/100000) average regularization: 0.014865400269627571\n",
      "[proc 1][Train] 1 steps take 1.347 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.211, backward: 0.003, update: 1.115\n",
      "[proc 0][Train](147/100000) average pos_loss: 0.3338450789451599\n",
      "[proc 0][Train](147/100000) average neg_loss: 0.35968488454818726\n",
      "[proc 0][Train](147/100000) average loss: 0.3467649817466736\n",
      "[proc 0][Train](147/100000) average regularization: 0.014512840658426285\n",
      "[proc 0][Train] 1 steps take 1.289 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](146/100000) average pos_loss: 0.3179088234901428\n",
      "[proc 1][Train](146/100000) average neg_loss: 0.6781522035598755\n",
      "[proc 1][Train](146/100000) average loss: 0.49803051352500916\n",
      "[proc 1][Train](146/100000) average regularization: 0.014670653268694878\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.020, forward: 0.199, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](148/100000) average pos_loss: 0.28433096408843994\n",
      "[proc 0][Train](148/100000) average neg_loss: 0.7017495632171631\n",
      "[proc 0][Train](148/100000) average loss: 0.4930402636528015\n",
      "[proc 0][Train](148/100000) average regularization: 0.015140373259782791\n",
      "[proc 0][Train] 1 steps take 1.265 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.201, backward: 0.004, update: 1.059\n",
      "[proc 1][Train](147/100000) average pos_loss: 0.3028298616409302\n",
      "[proc 1][Train](147/100000) average neg_loss: 0.40906858444213867\n",
      "[proc 1][Train](147/100000) average loss: 0.3559492230415344\n",
      "[proc 1][Train](147/100000) average regularization: 0.0147317536175251\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.103\n",
      "[proc 0][Train](149/100000) average pos_loss: 0.3222416937351227\n",
      "[proc 0][Train](149/100000) average neg_loss: 0.4171358346939087\n",
      "[proc 0][Train](149/100000) average loss: 0.3696887493133545\n",
      "[proc 0][Train](149/100000) average regularization: 0.01464350987225771\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.004, update: 1.094\n",
      "[proc 1][Train](148/100000) average pos_loss: 0.3107241988182068\n",
      "[proc 1][Train](148/100000) average neg_loss: 0.7299201488494873\n",
      "[proc 1][Train](148/100000) average loss: 0.5203222036361694\n",
      "[proc 1][Train](148/100000) average regularization: 0.014820348471403122\n",
      "[proc 1][Train] 1 steps take 1.305 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](150/100000) average pos_loss: 0.2954161763191223\n",
      "[proc 0][Train](150/100000) average neg_loss: 0.7073203921318054\n",
      "[proc 0][Train](150/100000) average loss: 0.5013682842254639\n",
      "[proc 0][Train](150/100000) average regularization: 0.015084273181855679\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.004, update: 1.082\n",
      "[proc 1][Train](149/100000) average pos_loss: 0.32205381989479065\n",
      "[proc 1][Train](149/100000) average neg_loss: 0.40923064947128296\n",
      "[proc 1][Train](149/100000) average loss: 0.365642249584198\n",
      "[proc 1][Train](149/100000) average regularization: 0.01465540286153555\n",
      "[proc 1][Train] 1 steps take 1.308 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](151/100000) average pos_loss: 0.34457266330718994\n",
      "[proc 0][Train](151/100000) average neg_loss: 0.3883495330810547\n",
      "[proc 0][Train](151/100000) average loss: 0.3664610981941223\n",
      "[proc 0][Train](151/100000) average regularization: 0.014506887644529343\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.085\n",
      "[proc 1][Train](150/100000) average pos_loss: 0.3268066346645355\n",
      "[proc 1][Train](150/100000) average neg_loss: 0.6827260255813599\n",
      "[proc 1][Train](150/100000) average loss: 0.5047663450241089\n",
      "[proc 1][Train](150/100000) average regularization: 0.01466839574277401\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](152/100000) average pos_loss: 0.29116183519363403\n",
      "[proc 0][Train](152/100000) average neg_loss: 0.6853150129318237\n",
      "[proc 0][Train](152/100000) average loss: 0.4882384240627289\n",
      "[proc 0][Train](152/100000) average regularization: 0.014983639121055603\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.085\n",
      "[proc 1][Train](151/100000) average pos_loss: 0.3080555200576782\n",
      "[proc 1][Train](151/100000) average neg_loss: 0.4088839888572693\n",
      "[proc 1][Train](151/100000) average loss: 0.35846975445747375\n",
      "[proc 1][Train](151/100000) average regularization: 0.014677147381007671\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.085\n",
      "[proc 0][Train](153/100000) average pos_loss: 0.3373384475708008\n",
      "[proc 0][Train](153/100000) average neg_loss: 0.3773996829986572\n",
      "[proc 0][Train](153/100000) average loss: 0.357369065284729\n",
      "[proc 0][Train](153/100000) average regularization: 0.014698577113449574\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.086\n",
      "[proc 1][Train](152/100000) average pos_loss: 0.3182229697704315\n",
      "[proc 1][Train](152/100000) average neg_loss: 0.6631207466125488\n",
      "[proc 1][Train](152/100000) average loss: 0.49067187309265137\n",
      "[proc 1][Train](152/100000) average regularization: 0.01464800350368023\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.084\n",
      "[proc 0][Train](154/100000) average pos_loss: 0.29146862030029297\n",
      "[proc 0][Train](154/100000) average neg_loss: 0.7300131320953369\n",
      "[proc 0][Train](154/100000) average loss: 0.5107408761978149\n",
      "[proc 0][Train](154/100000) average regularization: 0.015112683176994324\n",
      "[proc 0][Train] 1 steps take 1.259 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.202, backward: 0.003, update: 1.052\n",
      "[proc 1][Train](153/100000) average pos_loss: 0.2961887717247009\n",
      "[proc 1][Train](153/100000) average neg_loss: 0.41044992208480835\n",
      "[proc 1][Train](153/100000) average loss: 0.35331934690475464\n",
      "[proc 1][Train](153/100000) average regularization: 0.01491253636777401\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.096\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](155/100000) average pos_loss: 0.326238751411438\n",
      "[proc 0][Train](155/100000) average neg_loss: 0.41304466128349304\n",
      "[proc 0][Train](155/100000) average loss: 0.3696417212486267\n",
      "[proc 0][Train](155/100000) average regularization: 0.014556439593434334\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.112\n",
      "[proc 1][Train](154/100000) average pos_loss: 0.3191699981689453\n",
      "[proc 1][Train](154/100000) average neg_loss: 0.6573278307914734\n",
      "[proc 1][Train](154/100000) average loss: 0.48824891448020935\n",
      "[proc 1][Train](154/100000) average regularization: 0.01475348137319088\n",
      "[proc 1][Train] 1 steps take 1.338 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.118\n",
      "[proc 0][Train](156/100000) average pos_loss: 0.2945241928100586\n",
      "[proc 0][Train](156/100000) average neg_loss: 0.6723712682723999\n",
      "[proc 0][Train](156/100000) average loss: 0.48344773054122925\n",
      "[proc 0][Train](156/100000) average regularization: 0.015202881768345833\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](155/100000) average pos_loss: 0.30976980924606323\n",
      "[proc 1][Train](155/100000) average neg_loss: 0.38413915038108826\n",
      "[proc 1][Train](155/100000) average loss: 0.34695446491241455\n",
      "[proc 1][Train](155/100000) average regularization: 0.01488393172621727\n",
      "[proc 1][Train] 1 steps take 1.325 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.107\n",
      "[proc 0][Train](157/100000) average pos_loss: 0.3300848603248596\n",
      "[proc 0][Train](157/100000) average neg_loss: 0.3889749050140381\n",
      "[proc 0][Train](157/100000) average loss: 0.35952988266944885\n",
      "[proc 0][Train](157/100000) average regularization: 0.014726212248206139\n",
      "[proc 0][Train] 1 steps take 1.305 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](156/100000) average pos_loss: 0.31018930673599243\n",
      "[proc 1][Train](156/100000) average neg_loss: 0.7045217752456665\n",
      "[proc 1][Train](156/100000) average loss: 0.5073555707931519\n",
      "[proc 1][Train](156/100000) average regularization: 0.015038692392408848\n",
      "[proc 1][Train] 1 steps take 1.351 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.133\n",
      "[proc 0][Train](158/100000) average pos_loss: 0.28896400332450867\n",
      "[proc 0][Train](158/100000) average neg_loss: 0.8040430545806885\n",
      "[proc 0][Train](158/100000) average loss: 0.5465035438537598\n",
      "[proc 0][Train](158/100000) average regularization: 0.015360192395746708\n",
      "[proc 0][Train] 1 steps take 1.326 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.106\n",
      "[proc 1][Train](157/100000) average pos_loss: 0.3034864068031311\n",
      "[proc 1][Train](157/100000) average neg_loss: 0.41732925176620483\n",
      "[proc 1][Train](157/100000) average loss: 0.36040782928466797\n",
      "[proc 1][Train](157/100000) average regularization: 0.015106786042451859\n",
      "[proc 1][Train] 1 steps take 1.293 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.076\n",
      "[proc 0][Train](159/100000) average pos_loss: 0.3556510806083679\n",
      "[proc 0][Train](159/100000) average neg_loss: 0.37479832768440247\n",
      "[proc 0][Train](159/100000) average loss: 0.3652247190475464\n",
      "[proc 0][Train](159/100000) average regularization: 0.014395046979188919\n",
      "[proc 0][Train] 1 steps take 1.292 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.077\n",
      "[proc 1][Train](158/100000) average pos_loss: 0.3358927071094513\n",
      "[proc 1][Train](158/100000) average neg_loss: 0.6076107025146484\n",
      "[proc 1][Train](158/100000) average loss: 0.47175168991088867\n",
      "[proc 1][Train](158/100000) average regularization: 0.01462630182504654\n",
      "[proc 1][Train] 1 steps take 1.271 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.053\n",
      "[proc 0][Train](160/100000) average pos_loss: 0.3026488721370697\n",
      "[proc 0][Train](160/100000) average neg_loss: 0.6610146760940552\n",
      "[proc 0][Train](160/100000) average loss: 0.48183178901672363\n",
      "[proc 0][Train](160/100000) average regularization: 0.014999683015048504\n",
      "[proc 0][Train] 1 steps take 1.332 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.195, backward: 0.002, update: 1.134\n",
      "[proc 1][Train](159/100000) average pos_loss: 0.3062537610530853\n",
      "[proc 1][Train](159/100000) average neg_loss: 0.4056721329689026\n",
      "[proc 1][Train](159/100000) average loss: 0.35596293210983276\n",
      "[proc 1][Train](159/100000) average regularization: 0.014960609376430511\n",
      "[proc 1][Train] 1 steps take 1.346 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.130\n",
      "[proc 0][Train](161/100000) average pos_loss: 0.3289772868156433\n",
      "[proc 0][Train](161/100000) average neg_loss: 0.3835980296134949\n",
      "[proc 0][Train](161/100000) average loss: 0.3562876582145691\n",
      "[proc 0][Train](161/100000) average regularization: 0.014716574922204018\n",
      "[proc 0][Train] 1 steps take 1.392 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.223, backward: 0.003, update: 1.151\n",
      "[proc 1][Train](160/100000) average pos_loss: 0.30750566720962524\n",
      "[proc 1][Train](160/100000) average neg_loss: 0.640671968460083\n",
      "[proc 1][Train](160/100000) average loss: 0.4740888178348541\n",
      "[proc 1][Train](160/100000) average regularization: 0.014968443661928177\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.080\n",
      "[proc 0][Train](162/100000) average pos_loss: 0.28844866156578064\n",
      "[proc 0][Train](162/100000) average neg_loss: 0.7292443513870239\n",
      "[proc 0][Train](162/100000) average loss: 0.5088465213775635\n",
      "[proc 0][Train](162/100000) average regularization: 0.015323445200920105\n",
      "[proc 0][Train] 1 steps take 1.323 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.220, backward: 0.003, update: 1.083\n",
      "[proc 1][Train](161/100000) average pos_loss: 0.30516505241394043\n",
      "[proc 1][Train](161/100000) average neg_loss: 0.370172381401062\n",
      "[proc 1][Train](161/100000) average loss: 0.3376687169075012\n",
      "[proc 1][Train](161/100000) average regularization: 0.0150402020663023\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.212, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](163/100000) average pos_loss: 0.3415418863296509\n",
      "[proc 0][Train](163/100000) average neg_loss: 0.36352628469467163\n",
      "[proc 0][Train](163/100000) average loss: 0.35253408551216125\n",
      "[proc 0][Train](163/100000) average regularization: 0.014752916060388088\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](162/100000) average pos_loss: 0.3122968077659607\n",
      "[proc 1][Train](162/100000) average neg_loss: 0.6883546710014343\n",
      "[proc 1][Train](162/100000) average loss: 0.5003257393836975\n",
      "[proc 1][Train](162/100000) average regularization: 0.014975540339946747\n",
      "[proc 1][Train] 1 steps take 1.333 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.212, backward: 0.003, update: 1.101\n",
      "[proc 0][Train](164/100000) average pos_loss: 0.28670087456703186\n",
      "[proc 0][Train](164/100000) average neg_loss: 0.721419632434845\n",
      "[proc 0][Train](164/100000) average loss: 0.5040602684020996\n",
      "[proc 0][Train](164/100000) average regularization: 0.01534226443618536\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.103\n",
      "[proc 1][Train](163/100000) average pos_loss: 0.2918414771556854\n",
      "[proc 1][Train](163/100000) average neg_loss: 0.4135923981666565\n",
      "[proc 1][Train](163/100000) average loss: 0.35271692276000977\n",
      "[proc 1][Train](163/100000) average regularization: 0.01529447641223669\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](165/100000) average pos_loss: 0.3340328335762024\n",
      "[proc 0][Train](165/100000) average neg_loss: 0.3740423321723938\n",
      "[proc 0][Train](165/100000) average loss: 0.3540375828742981\n",
      "[proc 0][Train](165/100000) average regularization: 0.01483384519815445\n",
      "[proc 0][Train] 1 steps take 1.321 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.105\n",
      "[proc 1][Train](164/100000) average pos_loss: 0.3194120526313782\n",
      "[proc 1][Train](164/100000) average neg_loss: 0.6649432182312012\n",
      "[proc 1][Train](164/100000) average loss: 0.4921776354312897\n",
      "[proc 1][Train](164/100000) average regularization: 0.015035795047879219\n",
      "[proc 1][Train] 1 steps take 1.328 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.219, backward: 0.003, update: 1.104\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](166/100000) average pos_loss: 0.29795703291893005\n",
      "[proc 0][Train](166/100000) average neg_loss: 0.6661202907562256\n",
      "[proc 0][Train](166/100000) average loss: 0.482038676738739\n",
      "[proc 0][Train](166/100000) average regularization: 0.015323737636208534\n",
      "[proc 0][Train] 1 steps take 1.284 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.069\n",
      "[proc 1][Train](165/100000) average pos_loss: 0.3094629943370819\n",
      "[proc 1][Train](165/100000) average neg_loss: 0.3983577489852905\n",
      "[proc 1][Train](165/100000) average loss: 0.3539103865623474\n",
      "[proc 1][Train](165/100000) average regularization: 0.01521996594965458\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](167/100000) average pos_loss: 0.33827900886535645\n",
      "[proc 0][Train](167/100000) average neg_loss: 0.3772274851799011\n",
      "[proc 0][Train](167/100000) average loss: 0.3577532470226288\n",
      "[proc 0][Train](167/100000) average regularization: 0.014885027892887592\n",
      "[proc 0][Train] 1 steps take 1.397 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.225, backward: 0.003, update: 1.167\n",
      "[proc 1][Train](166/100000) average pos_loss: 0.32083308696746826\n",
      "[proc 1][Train](166/100000) average neg_loss: 0.6601436138153076\n",
      "[proc 1][Train](166/100000) average loss: 0.49048835039138794\n",
      "[proc 1][Train](166/100000) average regularization: 0.015112337656319141\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.099\n",
      "[proc 0][Train](168/100000) average pos_loss: 0.28076791763305664\n",
      "[proc 0][Train](168/100000) average neg_loss: 0.7281904220581055\n",
      "[proc 0][Train](168/100000) average loss: 0.504479169845581\n",
      "[proc 0][Train](168/100000) average regularization: 0.015557161532342434\n",
      "[proc 0][Train] 1 steps take 1.284 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.066\n",
      "[proc 1][Train](167/100000) average pos_loss: 0.304296612739563\n",
      "[proc 1][Train](167/100000) average neg_loss: 0.409233957529068\n",
      "[proc 1][Train](167/100000) average loss: 0.3567652702331543\n",
      "[proc 1][Train](167/100000) average regularization: 0.015321466140449047\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](169/100000) average pos_loss: 0.3468169569969177\n",
      "[proc 0][Train](169/100000) average neg_loss: 0.3894643187522888\n",
      "[proc 0][Train](169/100000) average loss: 0.36814063787460327\n",
      "[proc 0][Train](169/100000) average regularization: 0.01487138494849205\n",
      "[proc 0][Train] 1 steps take 1.289 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.071\n",
      "[proc 1][Train](168/100000) average pos_loss: 0.3285224437713623\n",
      "[proc 1][Train](168/100000) average neg_loss: 0.6242191791534424\n",
      "[proc 1][Train](168/100000) average loss: 0.47637081146240234\n",
      "[proc 1][Train](168/100000) average regularization: 0.015040673315525055\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.084\n",
      "[proc 0][Train](170/100000) average pos_loss: 0.3030015528202057\n",
      "[proc 0][Train](170/100000) average neg_loss: 0.6660817265510559\n",
      "[proc 0][Train](170/100000) average loss: 0.484541654586792\n",
      "[proc 0][Train](170/100000) average regularization: 0.015439840964972973\n",
      "[proc 0][Train] 1 steps take 1.287 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.069\n",
      "[proc 1][Train](169/100000) average pos_loss: 0.30254119634628296\n",
      "[proc 1][Train](169/100000) average neg_loss: 0.35027745366096497\n",
      "[proc 1][Train](169/100000) average loss: 0.32640933990478516\n",
      "[proc 1][Train](169/100000) average regularization: 0.015292253345251083\n",
      "[proc 1][Train] 1 steps take 1.279 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.073\n",
      "[proc 0][Train](171/100000) average pos_loss: 0.3333849310874939\n",
      "[proc 0][Train](171/100000) average neg_loss: 0.39153701066970825\n",
      "[proc 0][Train](171/100000) average loss: 0.3624609708786011\n",
      "[proc 0][Train](171/100000) average regularization: 0.015187813900411129\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.081\n",
      "[proc 1][Train](170/100000) average pos_loss: 0.29870694875717163\n",
      "[proc 1][Train](170/100000) average neg_loss: 0.7079179883003235\n",
      "[proc 1][Train](170/100000) average loss: 0.5033124685287476\n",
      "[proc 1][Train](170/100000) average regularization: 0.015290994197130203\n",
      "[proc 1][Train] 1 steps take 1.277 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.072\n",
      "[proc 0][Train](172/100000) average pos_loss: 0.28593066334724426\n",
      "[proc 0][Train](172/100000) average neg_loss: 0.7486518621444702\n",
      "[proc 0][Train](172/100000) average loss: 0.517291247844696\n",
      "[proc 0][Train](172/100000) average regularization: 0.015645885840058327\n",
      "[proc 0][Train] 1 steps take 1.312 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.097\n",
      "[proc 1][Train](171/100000) average pos_loss: 0.307550847530365\n",
      "[proc 1][Train](171/100000) average neg_loss: 0.38643792271614075\n",
      "[proc 1][Train](171/100000) average loss: 0.34699440002441406\n",
      "[proc 1][Train](171/100000) average regularization: 0.01548642199486494\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.196, backward: 0.003, update: 1.126\n",
      "[proc 0][Train](173/100000) average pos_loss: 0.3552444279193878\n",
      "[proc 0][Train](173/100000) average neg_loss: 0.3682215213775635\n",
      "[proc 0][Train](173/100000) average loss: 0.36173295974731445\n",
      "[proc 0][Train](173/100000) average regularization: 0.014772605150938034\n",
      "[proc 0][Train] 1 steps take 1.288 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.073\n",
      "[proc 1][Train](172/100000) average pos_loss: 0.33416295051574707\n",
      "[proc 1][Train](172/100000) average neg_loss: 0.6431187987327576\n",
      "[proc 1][Train](172/100000) average loss: 0.4886408746242523\n",
      "[proc 1][Train](172/100000) average regularization: 0.014959022402763367\n",
      "[proc 1][Train] 1 steps take 1.337 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.119\n",
      "[proc 0][Train](174/100000) average pos_loss: 0.2929965853691101\n",
      "[proc 0][Train](174/100000) average neg_loss: 0.6883708834648132\n",
      "[proc 0][Train](174/100000) average loss: 0.49068373441696167\n",
      "[proc 0][Train](174/100000) average regularization: 0.015416246838867664\n",
      "[proc 0][Train] 1 steps take 1.320 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.105\n",
      "[proc 1][Train](173/100000) average pos_loss: 0.3066781163215637\n",
      "[proc 1][Train](173/100000) average neg_loss: 0.3969663381576538\n",
      "[proc 1][Train](173/100000) average loss: 0.35182222723960876\n",
      "[proc 1][Train](173/100000) average regularization: 0.015416218899190426\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.078\n",
      "[proc 0][Train](175/100000) average pos_loss: 0.3321685791015625\n",
      "[proc 0][Train](175/100000) average neg_loss: 0.3977351784706116\n",
      "[proc 0][Train](175/100000) average loss: 0.36495187878608704\n",
      "[proc 0][Train](175/100000) average regularization: 0.014974048361182213\n",
      "[proc 0][Train] 1 steps take 1.295 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.080\n",
      "[proc 1][Train](174/100000) average pos_loss: 0.3085624575614929\n",
      "[proc 1][Train](174/100000) average neg_loss: 0.6833012104034424\n",
      "[proc 1][Train](174/100000) average loss: 0.49593183398246765\n",
      "[proc 1][Train](174/100000) average regularization: 0.015260577201843262\n",
      "[proc 1][Train] 1 steps take 1.348 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.265, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](176/100000) average pos_loss: 0.2957414984703064\n",
      "[proc 0][Train](176/100000) average neg_loss: 0.6824169158935547\n",
      "[proc 0][Train](176/100000) average loss: 0.48907920718193054\n",
      "[proc 0][Train](176/100000) average regularization: 0.01550085935741663\n",
      "[proc 0][Train] 1 steps take 1.288 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.085\n",
      "[proc 1][Train](175/100000) average pos_loss: 0.31653159856796265\n",
      "[proc 1][Train](175/100000) average neg_loss: 0.3579480051994324\n",
      "[proc 1][Train](175/100000) average loss: 0.3372398018836975\n",
      "[proc 1][Train](175/100000) average regularization: 0.015114068984985352\n",
      "[proc 1][Train] 1 steps take 1.320 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.218, backward: 0.003, update: 1.098\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](177/100000) average pos_loss: 0.347719669342041\n",
      "[proc 0][Train](177/100000) average neg_loss: 0.347974956035614\n",
      "[proc 0][Train](177/100000) average loss: 0.3478473126888275\n",
      "[proc 0][Train](177/100000) average regularization: 0.01496558915823698\n",
      "[proc 0][Train] 1 steps take 1.333 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.217, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](176/100000) average pos_loss: 0.3203223645687103\n",
      "[proc 1][Train](176/100000) average neg_loss: 0.6771261692047119\n",
      "[proc 1][Train](176/100000) average loss: 0.4987242817878723\n",
      "[proc 1][Train](176/100000) average regularization: 0.015268501825630665\n",
      "[proc 1][Train] 1 steps take 1.323 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.105\n",
      "[proc 0][Train](178/100000) average pos_loss: 0.2890990972518921\n",
      "[proc 0][Train](178/100000) average neg_loss: 0.7517251968383789\n",
      "[proc 0][Train](178/100000) average loss: 0.5204121470451355\n",
      "[proc 0][Train](178/100000) average regularization: 0.015449831262230873\n",
      "[proc 0][Train] 1 steps take 1.316 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.211, backward: 0.003, update: 1.086\n",
      "[proc 1][Train](177/100000) average pos_loss: 0.3061641454696655\n",
      "[proc 1][Train](177/100000) average neg_loss: 0.4463898539543152\n",
      "[proc 1][Train](177/100000) average loss: 0.37627699971199036\n",
      "[proc 1][Train](177/100000) average regularization: 0.015283056534826756\n",
      "[proc 1][Train] 1 steps take 1.350 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.213, backward: 0.003, update: 1.116\n",
      "[proc 0][Train](179/100000) average pos_loss: 0.3418027460575104\n",
      "[proc 0][Train](179/100000) average neg_loss: 0.38193580508232117\n",
      "[proc 0][Train](179/100000) average loss: 0.36186927556991577\n",
      "[proc 0][Train](179/100000) average regularization: 0.014731434173882008\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.097\n",
      "[proc 1][Train](178/100000) average pos_loss: 0.3441806435585022\n",
      "[proc 1][Train](178/100000) average neg_loss: 0.6454733610153198\n",
      "[proc 1][Train](178/100000) average loss: 0.494827002286911\n",
      "[proc 1][Train](178/100000) average regularization: 0.014840853400528431\n",
      "[proc 1][Train] 1 steps take 1.334 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.212, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](180/100000) average pos_loss: 0.30747997760772705\n",
      "[proc 0][Train](180/100000) average neg_loss: 0.6930729150772095\n",
      "[proc 0][Train](180/100000) average loss: 0.5002764463424683\n",
      "[proc 0][Train](180/100000) average regularization: 0.015173595398664474\n",
      "[proc 0][Train] 1 steps take 1.322 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.105\n",
      "[proc 1][Train](179/100000) average pos_loss: 0.30878376960754395\n",
      "[proc 1][Train](179/100000) average neg_loss: 0.38507720828056335\n",
      "[proc 1][Train](179/100000) average loss: 0.34693050384521484\n",
      "[proc 1][Train](179/100000) average regularization: 0.015119597315788269\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.091\n",
      "[proc 0][Train](181/100000) average pos_loss: 0.33462971448898315\n",
      "[proc 0][Train](181/100000) average neg_loss: 0.4119631052017212\n",
      "[proc 0][Train](181/100000) average loss: 0.3732964098453522\n",
      "[proc 0][Train](181/100000) average regularization: 0.014986619353294373\n",
      "[proc 0][Train] 1 steps take 1.295 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.195, backward: 0.002, update: 1.096\n",
      "[proc 1][Train](180/100000) average pos_loss: 0.30858558416366577\n",
      "[proc 1][Train](180/100000) average neg_loss: 0.6500968933105469\n",
      "[proc 1][Train](180/100000) average loss: 0.4793412387371063\n",
      "[proc 1][Train](180/100000) average regularization: 0.015262656845152378\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.091\n",
      "[proc 0][Train](182/100000) average pos_loss: 0.2953229248523712\n",
      "[proc 0][Train](182/100000) average neg_loss: 0.7294049263000488\n",
      "[proc 0][Train](182/100000) average loss: 0.5123639106750488\n",
      "[proc 0][Train](182/100000) average regularization: 0.015437109395861626\n",
      "[proc 0][Train] 1 steps take 1.333 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.113\n",
      "[proc 1][Train](181/100000) average pos_loss: 0.31604447960853577\n",
      "[proc 1][Train](181/100000) average neg_loss: 0.3771085739135742\n",
      "[proc 1][Train](181/100000) average loss: 0.3465765118598938\n",
      "[proc 1][Train](181/100000) average regularization: 0.015063566155731678\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.215, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](183/100000) average pos_loss: 0.3637237250804901\n",
      "[proc 0][Train](183/100000) average neg_loss: 0.3333146870136261\n",
      "[proc 0][Train](183/100000) average loss: 0.3485192060470581\n",
      "[proc 0][Train](183/100000) average regularization: 0.014914941973984241\n",
      "[proc 0][Train] 1 steps take 1.268 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.193, backward: 0.002, update: 1.072\n",
      "[proc 1][Train](182/100000) average pos_loss: 0.3323146104812622\n",
      "[proc 1][Train](182/100000) average neg_loss: 0.6169211864471436\n",
      "[proc 1][Train](182/100000) average loss: 0.4746178984642029\n",
      "[proc 1][Train](182/100000) average regularization: 0.014910001307725906\n",
      "[proc 1][Train] 1 steps take 1.273 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.058\n",
      "[proc 0][Train](184/100000) average pos_loss: 0.2952112853527069\n",
      "[proc 0][Train](184/100000) average neg_loss: 0.695397675037384\n",
      "[proc 0][Train](184/100000) average loss: 0.4953044652938843\n",
      "[proc 0][Train](184/100000) average regularization: 0.015251598320901394\n",
      "[proc 0][Train] 1 steps take 1.323 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.107\n",
      "[proc 1][Train](183/100000) average pos_loss: 0.30159857869148254\n",
      "[proc 1][Train](183/100000) average neg_loss: 0.41400304436683655\n",
      "[proc 1][Train](183/100000) average loss: 0.35780081152915955\n",
      "[proc 1][Train](183/100000) average regularization: 0.015251467004418373\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.074\n",
      "[proc 0][Train](185/100000) average pos_loss: 0.32812976837158203\n",
      "[proc 0][Train](185/100000) average neg_loss: 0.4072014093399048\n",
      "[proc 0][Train](185/100000) average loss: 0.3676655888557434\n",
      "[proc 0][Train](185/100000) average regularization: 0.015040412545204163\n",
      "[proc 0][Train] 1 steps take 1.359 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.141\n",
      "[proc 1][Train](184/100000) average pos_loss: 0.3147808611392975\n",
      "[proc 1][Train](184/100000) average neg_loss: 0.6830759048461914\n",
      "[proc 1][Train](184/100000) average loss: 0.49892836809158325\n",
      "[proc 1][Train](184/100000) average regularization: 0.015102426521480083\n",
      "[proc 1][Train] 1 steps take 1.282 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.071\n",
      "[proc 0][Train](186/100000) average pos_loss: 0.2990119457244873\n",
      "[proc 0][Train](186/100000) average neg_loss: 0.7020593881607056\n",
      "[proc 0][Train](186/100000) average loss: 0.5005356669425964\n",
      "[proc 0][Train](186/100000) average regularization: 0.015293187461793423\n",
      "[proc 0][Train] 1 steps take 1.324 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](185/100000) average pos_loss: 0.3320009112358093\n",
      "[proc 1][Train](185/100000) average neg_loss: 0.3880825638771057\n",
      "[proc 1][Train](185/100000) average loss: 0.3600417375564575\n",
      "[proc 1][Train](185/100000) average regularization: 0.01509244367480278\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.086\n",
      "[proc 0][Train](187/100000) average pos_loss: 0.35217076539993286\n",
      "[proc 0][Train](187/100000) average neg_loss: 0.34158456325531006\n",
      "[proc 0][Train](187/100000) average loss: 0.34687766432762146\n",
      "[proc 0][Train](187/100000) average regularization: 0.014789359644055367\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](186/100000) average pos_loss: 0.32544881105422974\n",
      "[proc 1][Train](186/100000) average neg_loss: 0.6207593679428101\n",
      "[proc 1][Train](186/100000) average loss: 0.4731040894985199\n",
      "[proc 1][Train](186/100000) average regularization: 0.015108384191989899\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.083\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](188/100000) average pos_loss: 0.2976091206073761\n",
      "[proc 0][Train](188/100000) average neg_loss: 0.7898744344711304\n",
      "[proc 0][Train](188/100000) average loss: 0.543741762638092\n",
      "[proc 0][Train](188/100000) average regularization: 0.015594803728163242\n",
      "[proc 0][Train] 1 steps take 1.312 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.092\n",
      "[proc 1][Train](187/100000) average pos_loss: 0.3047032356262207\n",
      "[proc 1][Train](187/100000) average neg_loss: 0.39086806774139404\n",
      "[proc 1][Train](187/100000) average loss: 0.3477856516838074\n",
      "[proc 1][Train](187/100000) average regularization: 0.015321481041610241\n",
      "[proc 1][Train] 1 steps take 1.293 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.077\n",
      "[proc 0][Train](189/100000) average pos_loss: 0.3434351682662964\n",
      "[proc 0][Train](189/100000) average neg_loss: 0.36010509729385376\n",
      "[proc 0][Train](189/100000) average loss: 0.3517701327800751\n",
      "[proc 0][Train](189/100000) average regularization: 0.014877794310450554\n",
      "[proc 0][Train] 1 steps take 1.291 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](188/100000) average pos_loss: 0.32409316301345825\n",
      "[proc 1][Train](188/100000) average neg_loss: 0.64137864112854\n",
      "[proc 1][Train](188/100000) average loss: 0.48273590207099915\n",
      "[proc 1][Train](188/100000) average regularization: 0.014969431795179844\n",
      "[proc 1][Train] 1 steps take 1.280 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.194, backward: 0.002, update: 1.083\n",
      "[proc 0][Train](190/100000) average pos_loss: 0.29727211594581604\n",
      "[proc 0][Train](190/100000) average neg_loss: 0.7197391390800476\n",
      "[proc 0][Train](190/100000) average loss: 0.508505642414093\n",
      "[proc 0][Train](190/100000) average regularization: 0.015357550233602524\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](189/100000) average pos_loss: 0.30890119075775146\n",
      "[proc 1][Train](189/100000) average neg_loss: 0.397869348526001\n",
      "[proc 1][Train](189/100000) average loss: 0.3533852696418762\n",
      "[proc 1][Train](189/100000) average regularization: 0.015129760839045048\n",
      "[proc 1][Train] 1 steps take 1.329 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.113\n",
      "[proc 0][Train](191/100000) average pos_loss: 0.33285677433013916\n",
      "[proc 0][Train](191/100000) average neg_loss: 0.37321150302886963\n",
      "[proc 0][Train](191/100000) average loss: 0.3530341386795044\n",
      "[proc 0][Train](191/100000) average regularization: 0.01486206240952015\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](190/100000) average pos_loss: 0.31674712896347046\n",
      "[proc 1][Train](190/100000) average neg_loss: 0.6457494497299194\n",
      "[proc 1][Train](190/100000) average loss: 0.48124828934669495\n",
      "[proc 1][Train](190/100000) average regularization: 0.01501811109483242\n",
      "[proc 1][Train] 1 steps take 1.375 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.159\n",
      "[proc 0][Train](192/100000) average pos_loss: 0.31116175651550293\n",
      "[proc 0][Train](192/100000) average neg_loss: 0.6571567058563232\n",
      "[proc 0][Train](192/100000) average loss: 0.4841592311859131\n",
      "[proc 0][Train](192/100000) average regularization: 0.01520910020917654\n",
      "[proc 0][Train] 1 steps take 1.303 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.211, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](191/100000) average pos_loss: 0.3175276815891266\n",
      "[proc 1][Train](191/100000) average neg_loss: 0.3905121982097626\n",
      "[proc 1][Train](191/100000) average loss: 0.3540199398994446\n",
      "[proc 1][Train](191/100000) average regularization: 0.01516039576381445\n",
      "[proc 1][Train] 1 steps take 1.316 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](193/100000) average pos_loss: 0.35335516929626465\n",
      "[proc 0][Train](193/100000) average neg_loss: 0.3917955756187439\n",
      "[proc 0][Train](193/100000) average loss: 0.3725753724575043\n",
      "[proc 0][Train](193/100000) average regularization: 0.014833042398095131\n",
      "[proc 0][Train] 1 steps take 1.328 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.212, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](192/100000) average pos_loss: 0.3165048360824585\n",
      "[proc 1][Train](192/100000) average neg_loss: 0.6428446769714355\n",
      "[proc 1][Train](192/100000) average loss: 0.479674756526947\n",
      "[proc 1][Train](192/100000) average regularization: 0.015245174989104271\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.094\n",
      "[proc 0][Train](194/100000) average pos_loss: 0.30177605152130127\n",
      "[proc 0][Train](194/100000) average neg_loss: 0.7018024921417236\n",
      "[proc 0][Train](194/100000) average loss: 0.5017892718315125\n",
      "[proc 0][Train](194/100000) average regularization: 0.015516421757638454\n",
      "[proc 0][Train] 1 steps take 1.324 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.214, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](193/100000) average pos_loss: 0.3143273890018463\n",
      "[proc 1][Train](193/100000) average neg_loss: 0.3817266523838043\n",
      "[proc 1][Train](193/100000) average loss: 0.3480270206928253\n",
      "[proc 1][Train](193/100000) average regularization: 0.015229596756398678\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.212, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](195/100000) average pos_loss: 0.3390181362628937\n",
      "[proc 0][Train](195/100000) average neg_loss: 0.40258634090423584\n",
      "[proc 0][Train](195/100000) average loss: 0.37080222368240356\n",
      "[proc 0][Train](195/100000) average regularization: 0.014917279593646526\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.117\n",
      "[proc 1][Train](194/100000) average pos_loss: 0.321209579706192\n",
      "[proc 1][Train](194/100000) average neg_loss: 0.6551268100738525\n",
      "[proc 1][Train](194/100000) average loss: 0.4881681799888611\n",
      "[proc 1][Train](194/100000) average regularization: 0.015053720213472843\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.204, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](196/100000) average pos_loss: 0.29918116331100464\n",
      "[proc 0][Train](196/100000) average neg_loss: 0.6909192800521851\n",
      "[proc 0][Train](196/100000) average loss: 0.49505022168159485\n",
      "[proc 0][Train](196/100000) average regularization: 0.01527189090847969\n",
      "[proc 0][Train] 1 steps take 1.286 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.083\n",
      "[proc 1][Train](195/100000) average pos_loss: 0.3249804377555847\n",
      "[proc 1][Train](195/100000) average neg_loss: 0.35971105098724365\n",
      "[proc 1][Train](195/100000) average loss: 0.3423457443714142\n",
      "[proc 1][Train](195/100000) average regularization: 0.015186958014965057\n",
      "[proc 1][Train] 1 steps take 1.314 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.095\n",
      "[proc 0][Train](197/100000) average pos_loss: 0.3501335382461548\n",
      "[proc 0][Train](197/100000) average neg_loss: 0.3446256220340729\n",
      "[proc 0][Train](197/100000) average loss: 0.34737956523895264\n",
      "[proc 0][Train](197/100000) average regularization: 0.015009963884949684\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](196/100000) average pos_loss: 0.3192473351955414\n",
      "[proc 1][Train](196/100000) average neg_loss: 0.622741162776947\n",
      "[proc 1][Train](196/100000) average loss: 0.470994234085083\n",
      "[proc 1][Train](196/100000) average regularization: 0.015163464471697807\n",
      "[proc 1][Train] 1 steps take 1.314 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](198/100000) average pos_loss: 0.2952284812927246\n",
      "[proc 0][Train](198/100000) average neg_loss: 0.7860488295555115\n",
      "[proc 0][Train](198/100000) average loss: 0.5406386852264404\n",
      "[proc 0][Train](198/100000) average regularization: 0.015540977008640766\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](197/100000) average pos_loss: 0.3129875063896179\n",
      "[proc 1][Train](197/100000) average neg_loss: 0.363280713558197\n",
      "[proc 1][Train](197/100000) average loss: 0.33813410997390747\n",
      "[proc 1][Train](197/100000) average regularization: 0.01531255804002285\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.105\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](199/100000) average pos_loss: 0.3523728847503662\n",
      "[proc 0][Train](199/100000) average neg_loss: 0.3579504191875458\n",
      "[proc 0][Train](199/100000) average loss: 0.3551616668701172\n",
      "[proc 0][Train](199/100000) average regularization: 0.014761257916688919\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](198/100000) average pos_loss: 0.32217490673065186\n",
      "[proc 1][Train](198/100000) average neg_loss: 0.6361186504364014\n",
      "[proc 1][Train](198/100000) average loss: 0.4791467785835266\n",
      "[proc 1][Train](198/100000) average regularization: 0.015116310678422451\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](200/100000) average pos_loss: 0.2961369752883911\n",
      "[proc 0][Train](200/100000) average neg_loss: 0.6956909894943237\n",
      "[proc 0][Train](200/100000) average loss: 0.4959139823913574\n",
      "[proc 0][Train](200/100000) average regularization: 0.015409043990075588\n",
      "[proc 0][Train] 1 steps take 1.332 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.004, update: 1.109\n",
      "[proc 1][Train](199/100000) average pos_loss: 0.30609017610549927\n",
      "[proc 1][Train](199/100000) average neg_loss: 0.4113176465034485\n",
      "[proc 1][Train](199/100000) average loss: 0.3587039113044739\n",
      "[proc 1][Train](199/100000) average regularization: 0.015370002016425133\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.215, backward: 0.003, update: 1.074\n",
      "[proc 0][Train](201/100000) average pos_loss: 0.3490295112133026\n",
      "[proc 0][Train](201/100000) average neg_loss: 0.3549690544605255\n",
      "[proc 0][Train](201/100000) average loss: 0.35199928283691406\n",
      "[proc 0][Train](201/100000) average regularization: 0.01507525984197855\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.004, update: 1.065\n",
      "[proc 1][Train](200/100000) average pos_loss: 0.331882119178772\n",
      "[proc 1][Train](200/100000) average neg_loss: 0.5946694612503052\n",
      "[proc 1][Train](200/100000) average loss: 0.4632757902145386\n",
      "[proc 1][Train](200/100000) average regularization: 0.015081487596035004\n",
      "[proc 1][Train] 1 steps take 1.286 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.070\n",
      "[proc 0][Train](202/100000) average pos_loss: 0.3015879988670349\n",
      "[proc 0][Train](202/100000) average neg_loss: 0.6669481992721558\n",
      "[proc 0][Train](202/100000) average loss: 0.48426809906959534\n",
      "[proc 0][Train](202/100000) average regularization: 0.015487887896597385\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.004, update: 1.073\n",
      "[proc 1][Train](201/100000) average pos_loss: 0.3138974606990814\n",
      "[proc 1][Train](201/100000) average neg_loss: 0.3721908926963806\n",
      "[proc 1][Train](201/100000) average loss: 0.3430441617965698\n",
      "[proc 1][Train](201/100000) average regularization: 0.015489795245230198\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.005, update: 1.086\n",
      "[proc 0][Train](203/100000) average pos_loss: 0.3284820318222046\n",
      "[proc 0][Train](203/100000) average neg_loss: 0.33092790842056274\n",
      "[proc 0][Train](203/100000) average loss: 0.32970497012138367\n",
      "[proc 0][Train](203/100000) average regularization: 0.015431057661771774\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.004, update: 1.092\n",
      "[proc 1][Train](202/100000) average pos_loss: 0.3187980651855469\n",
      "[proc 1][Train](202/100000) average neg_loss: 0.6233187913894653\n",
      "[proc 1][Train](202/100000) average loss: 0.4710584282875061\n",
      "[proc 1][Train](202/100000) average regularization: 0.015301492065191269\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](204/100000) average pos_loss: 0.29646027088165283\n",
      "[proc 0][Train](204/100000) average neg_loss: 0.7582480907440186\n",
      "[proc 0][Train](204/100000) average loss: 0.5273541808128357\n",
      "[proc 0][Train](204/100000) average regularization: 0.015627093613147736\n",
      "[proc 0][Train] 1 steps take 1.341 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.218, backward: 0.003, update: 1.119\n",
      "[proc 1][Train](203/100000) average pos_loss: 0.30806881189346313\n",
      "[proc 1][Train](203/100000) average neg_loss: 0.39122289419174194\n",
      "[proc 1][Train](203/100000) average loss: 0.34964585304260254\n",
      "[proc 1][Train](203/100000) average regularization: 0.015514206141233444\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](205/100000) average pos_loss: 0.33782631158828735\n",
      "[proc 0][Train](205/100000) average neg_loss: 0.34722328186035156\n",
      "[proc 0][Train](205/100000) average loss: 0.34252479672431946\n",
      "[proc 0][Train](205/100000) average regularization: 0.015134375542402267\n",
      "[proc 0][Train] 1 steps take 1.358 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.141\n",
      "[proc 1][Train](204/100000) average pos_loss: 0.3226688504219055\n",
      "[proc 1][Train](204/100000) average neg_loss: 0.6082833409309387\n",
      "[proc 1][Train](204/100000) average loss: 0.4654760956764221\n",
      "[proc 1][Train](204/100000) average regularization: 0.01534825935959816\n",
      "[proc 1][Train] 1 steps take 1.282 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.063\n",
      "[proc 0][Train](206/100000) average pos_loss: 0.2933022379875183\n",
      "[proc 0][Train](206/100000) average neg_loss: 0.6955162286758423\n",
      "[proc 0][Train](206/100000) average loss: 0.4944092333316803\n",
      "[proc 0][Train](206/100000) average regularization: 0.015659788623452187\n",
      "[proc 0][Train] 1 steps take 1.286 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.069\n",
      "[proc 1][Train](205/100000) average pos_loss: 0.30575400590896606\n",
      "[proc 1][Train](205/100000) average neg_loss: 0.37076520919799805\n",
      "[proc 1][Train](205/100000) average loss: 0.33825960755348206\n",
      "[proc 1][Train](205/100000) average regularization: 0.015354057773947716\n",
      "[proc 1][Train] 1 steps take 1.322 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.104\n",
      "[proc 0][Train](207/100000) average pos_loss: 0.3372407555580139\n",
      "[proc 0][Train](207/100000) average neg_loss: 0.38422706723213196\n",
      "[proc 0][Train](207/100000) average loss: 0.36073392629623413\n",
      "[proc 0][Train](207/100000) average regularization: 0.015287059359252453\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.004, update: 1.096\n",
      "[proc 1][Train](206/100000) average pos_loss: 0.31861037015914917\n",
      "[proc 1][Train](206/100000) average neg_loss: 0.671691358089447\n",
      "[proc 1][Train](206/100000) average loss: 0.4951508641242981\n",
      "[proc 1][Train](206/100000) average regularization: 0.015362945385277271\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.084\n",
      "[proc 0][Train](208/100000) average pos_loss: 0.3057700991630554\n",
      "[proc 0][Train](208/100000) average neg_loss: 0.6993067860603333\n",
      "[proc 0][Train](208/100000) average loss: 0.5025384426116943\n",
      "[proc 0][Train](208/100000) average regularization: 0.01578737236559391\n",
      "[proc 0][Train] 1 steps take 1.394 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.222, backward: 0.003, update: 1.168\n",
      "[proc 1][Train](207/100000) average pos_loss: 0.3195706605911255\n",
      "[proc 1][Train](207/100000) average neg_loss: 0.3615519106388092\n",
      "[proc 1][Train](207/100000) average loss: 0.34056127071380615\n",
      "[proc 1][Train](207/100000) average regularization: 0.015410185791552067\n",
      "[proc 1][Train] 1 steps take 1.350 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.133\n",
      "[proc 0][Train](209/100000) average pos_loss: 0.34480488300323486\n",
      "[proc 0][Train](209/100000) average neg_loss: 0.34025436639785767\n",
      "[proc 0][Train](209/100000) average loss: 0.34252962470054626\n",
      "[proc 0][Train](209/100000) average regularization: 0.015132658183574677\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.206, backward: 0.003, update: 1.079\n",
      "[proc 1][Train](208/100000) average pos_loss: 0.3280620574951172\n",
      "[proc 1][Train](208/100000) average neg_loss: 0.6948050856590271\n",
      "[proc 1][Train](208/100000) average loss: 0.5114336013793945\n",
      "[proc 1][Train](208/100000) average regularization: 0.015174921602010727\n",
      "[proc 1][Train] 1 steps take 1.355 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.140\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](210/100000) average pos_loss: 0.30053073167800903\n",
      "[proc 0][Train](210/100000) average neg_loss: 0.7023389339447021\n",
      "[proc 0][Train](210/100000) average loss: 0.5014348030090332\n",
      "[proc 0][Train](210/100000) average regularization: 0.015444451943039894\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.211, backward: 0.003, update: 1.081\n",
      "[proc 1][Train](209/100000) average pos_loss: 0.3151133358478546\n",
      "[proc 1][Train](209/100000) average neg_loss: 0.3698301315307617\n",
      "[proc 1][Train](209/100000) average loss: 0.342471718788147\n",
      "[proc 1][Train](209/100000) average regularization: 0.015364651568233967\n",
      "[proc 1][Train] 1 steps take 1.361 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.210, backward: 0.002, update: 1.132\n",
      "[proc 0][Train](211/100000) average pos_loss: 0.34284132719039917\n",
      "[proc 0][Train](211/100000) average neg_loss: 0.3604471981525421\n",
      "[proc 0][Train](211/100000) average loss: 0.35164427757263184\n",
      "[proc 0][Train](211/100000) average regularization: 0.015102052129805088\n",
      "[proc 0][Train] 1 steps take 1.307 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](210/100000) average pos_loss: 0.3304558992385864\n",
      "[proc 1][Train](210/100000) average neg_loss: 0.5727347135543823\n",
      "[proc 1][Train](210/100000) average loss: 0.4515953063964844\n",
      "[proc 1][Train](210/100000) average regularization: 0.015254800207912922\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.216, backward: 0.003, update: 1.064\n",
      "[proc 0][Train](212/100000) average pos_loss: 0.303453266620636\n",
      "[proc 0][Train](212/100000) average neg_loss: 0.6859676837921143\n",
      "[proc 0][Train](212/100000) average loss: 0.4947104752063751\n",
      "[proc 0][Train](212/100000) average regularization: 0.015533152036368847\n",
      "[proc 0][Train] 1 steps take 1.284 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.068\n",
      "[proc 1][Train](211/100000) average pos_loss: 0.3069124221801758\n",
      "[proc 1][Train](211/100000) average neg_loss: 0.3917318880558014\n",
      "[proc 1][Train](211/100000) average loss: 0.3493221402168274\n",
      "[proc 1][Train](211/100000) average regularization: 0.015510131604969501\n",
      "[proc 1][Train] 1 steps take 1.313 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](213/100000) average pos_loss: 0.33418095111846924\n",
      "[proc 0][Train](213/100000) average neg_loss: 0.36089521646499634\n",
      "[proc 0][Train](213/100000) average loss: 0.3475380837917328\n",
      "[proc 0][Train](213/100000) average regularization: 0.01511487364768982\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](212/100000) average pos_loss: 0.3160504102706909\n",
      "[proc 1][Train](212/100000) average neg_loss: 0.6540003418922424\n",
      "[proc 1][Train](212/100000) average loss: 0.4850253760814667\n",
      "[proc 1][Train](212/100000) average regularization: 0.015553980134427547\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.218, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](214/100000) average pos_loss: 0.29961147904396057\n",
      "[proc 0][Train](214/100000) average neg_loss: 0.6798146963119507\n",
      "[proc 0][Train](214/100000) average loss: 0.48971307277679443\n",
      "[proc 0][Train](214/100000) average regularization: 0.015669550746679306\n",
      "[proc 0][Train] 1 steps take 1.334 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.117\n",
      "[proc 1][Train](213/100000) average pos_loss: 0.31933796405792236\n",
      "[proc 1][Train](213/100000) average neg_loss: 0.38213878870010376\n",
      "[proc 1][Train](213/100000) average loss: 0.35073837637901306\n",
      "[proc 1][Train](213/100000) average regularization: 0.015356198884546757\n",
      "[proc 1][Train] 1 steps take 1.329 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.111\n",
      "[proc 0][Train](215/100000) average pos_loss: 0.34252965450286865\n",
      "[proc 0][Train](215/100000) average neg_loss: 0.3630914092063904\n",
      "[proc 0][Train](215/100000) average loss: 0.3528105318546295\n",
      "[proc 0][Train](215/100000) average regularization: 0.015239113941788673\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](214/100000) average pos_loss: 0.3264056444168091\n",
      "[proc 1][Train](214/100000) average neg_loss: 0.689172089099884\n",
      "[proc 1][Train](214/100000) average loss: 0.507788896560669\n",
      "[proc 1][Train](214/100000) average regularization: 0.015322169288992882\n",
      "[proc 1][Train] 1 steps take 1.308 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](216/100000) average pos_loss: 0.30638301372528076\n",
      "[proc 0][Train](216/100000) average neg_loss: 0.7021240592002869\n",
      "[proc 0][Train](216/100000) average loss: 0.5042535066604614\n",
      "[proc 0][Train](216/100000) average regularization: 0.015539905987679958\n",
      "[proc 0][Train] 1 steps take 1.310 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.095\n",
      "[proc 1][Train](215/100000) average pos_loss: 0.31953513622283936\n",
      "[proc 1][Train](215/100000) average neg_loss: 0.3756985366344452\n",
      "[proc 1][Train](215/100000) average loss: 0.34761685132980347\n",
      "[proc 1][Train](215/100000) average regularization: 0.015245559625327587\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](217/100000) average pos_loss: 0.34909528493881226\n",
      "[proc 0][Train](217/100000) average neg_loss: 0.3143810033798218\n",
      "[proc 0][Train](217/100000) average loss: 0.331738144159317\n",
      "[proc 0][Train](217/100000) average regularization: 0.015049345791339874\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.083\n",
      "[proc 1][Train](216/100000) average pos_loss: 0.33170077204704285\n",
      "[proc 1][Train](216/100000) average neg_loss: 0.5918803811073303\n",
      "[proc 1][Train](216/100000) average loss: 0.4617905616760254\n",
      "[proc 1][Train](216/100000) average regularization: 0.015219761058688164\n",
      "[proc 1][Train] 1 steps take 1.291 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.076\n",
      "[proc 0][Train](218/100000) average pos_loss: 0.302237868309021\n",
      "[proc 0][Train](218/100000) average neg_loss: 0.6890023946762085\n",
      "[proc 0][Train](218/100000) average loss: 0.49562013149261475\n",
      "[proc 0][Train](218/100000) average regularization: 0.015573849901556969\n",
      "[proc 0][Train] 1 steps take 1.325 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](217/100000) average pos_loss: 0.30761048197746277\n",
      "[proc 1][Train](217/100000) average neg_loss: 0.3929741680622101\n",
      "[proc 1][Train](217/100000) average loss: 0.3502923250198364\n",
      "[proc 1][Train](217/100000) average regularization: 0.015467356890439987\n",
      "[proc 1][Train] 1 steps take 1.313 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.097\n",
      "[proc 0][Train](219/100000) average pos_loss: 0.3307010531425476\n",
      "[proc 0][Train](219/100000) average neg_loss: 0.3580546975135803\n",
      "[proc 0][Train](219/100000) average loss: 0.34437787532806396\n",
      "[proc 0][Train](219/100000) average regularization: 0.015324583277106285\n",
      "[proc 0][Train] 1 steps take 1.348 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.252, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](218/100000) average pos_loss: 0.3245703876018524\n",
      "[proc 1][Train](218/100000) average neg_loss: 0.6754382848739624\n",
      "[proc 1][Train](218/100000) average loss: 0.5000043511390686\n",
      "[proc 1][Train](218/100000) average regularization: 0.015369963832199574\n",
      "[proc 1][Train] 1 steps take 1.325 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.108\n",
      "[proc 0][Train](220/100000) average pos_loss: 0.3055671453475952\n",
      "[proc 0][Train](220/100000) average neg_loss: 0.6665273904800415\n",
      "[proc 0][Train](220/100000) average loss: 0.48604726791381836\n",
      "[proc 0][Train](220/100000) average regularization: 0.01571795716881752\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.081\n",
      "[proc 1][Train](219/100000) average pos_loss: 0.33045655488967896\n",
      "[proc 1][Train](219/100000) average neg_loss: 0.3675480782985687\n",
      "[proc 1][Train](219/100000) average loss: 0.34900230169296265\n",
      "[proc 1][Train](219/100000) average regularization: 0.015220768749713898\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.091\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](221/100000) average pos_loss: 0.34439441561698914\n",
      "[proc 0][Train](221/100000) average neg_loss: 0.3574226498603821\n",
      "[proc 0][Train](221/100000) average loss: 0.3509085178375244\n",
      "[proc 0][Train](221/100000) average regularization: 0.015245545655488968\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.066\n",
      "[proc 1][Train](220/100000) average pos_loss: 0.32524073123931885\n",
      "[proc 1][Train](220/100000) average neg_loss: 0.6383613348007202\n",
      "[proc 1][Train](220/100000) average loss: 0.48180103302001953\n",
      "[proc 1][Train](220/100000) average regularization: 0.0153748644515872\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](222/100000) average pos_loss: 0.30262717604637146\n",
      "[proc 0][Train](222/100000) average neg_loss: 0.6911796927452087\n",
      "[proc 0][Train](222/100000) average loss: 0.4969034194946289\n",
      "[proc 0][Train](222/100000) average regularization: 0.015656404197216034\n",
      "[proc 0][Train] 1 steps take 1.329 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.123\n",
      "[proc 1][Train](221/100000) average pos_loss: 0.31534886360168457\n",
      "[proc 1][Train](221/100000) average neg_loss: 0.41147392988204956\n",
      "[proc 1][Train](221/100000) average loss: 0.36341139674186707\n",
      "[proc 1][Train](221/100000) average regularization: 0.015400264412164688\n",
      "[proc 1][Train] 1 steps take 1.305 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.091\n",
      "[proc 0][Train](223/100000) average pos_loss: 0.35129639506340027\n",
      "[proc 0][Train](223/100000) average neg_loss: 0.3349918723106384\n",
      "[proc 0][Train](223/100000) average loss: 0.34314411878585815\n",
      "[proc 0][Train](223/100000) average regularization: 0.015266811475157738\n",
      "[proc 0][Train] 1 steps take 1.282 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](222/100000) average pos_loss: 0.33172520995140076\n",
      "[proc 1][Train](222/100000) average neg_loss: 0.6340485215187073\n",
      "[proc 1][Train](222/100000) average loss: 0.4828868508338928\n",
      "[proc 1][Train](222/100000) average regularization: 0.015280677936971188\n",
      "[proc 1][Train] 1 steps take 1.289 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.003, update: 1.077\n",
      "[proc 0][Train](224/100000) average pos_loss: 0.30591410398483276\n",
      "[proc 0][Train](224/100000) average neg_loss: 0.6677868366241455\n",
      "[proc 0][Train](224/100000) average loss: 0.48685047030448914\n",
      "[proc 0][Train](224/100000) average regularization: 0.015395264141261578\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.083\n",
      "[proc 1][Train](223/100000) average pos_loss: 0.3148297071456909\n",
      "[proc 1][Train](223/100000) average neg_loss: 0.36120322346687317\n",
      "[proc 1][Train](223/100000) average loss: 0.33801645040512085\n",
      "[proc 1][Train](223/100000) average regularization: 0.015441895462572575\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.197, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](225/100000) average pos_loss: 0.3410758078098297\n",
      "[proc 0][Train](225/100000) average neg_loss: 0.3714476525783539\n",
      "[proc 0][Train](225/100000) average loss: 0.3562617301940918\n",
      "[proc 0][Train](225/100000) average regularization: 0.015274973586201668\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.208, backward: 0.003, update: 1.070\n",
      "[proc 1][Train](224/100000) average pos_loss: 0.3188251852989197\n",
      "[proc 1][Train](224/100000) average neg_loss: 0.6524176597595215\n",
      "[proc 1][Train](224/100000) average loss: 0.4856214225292206\n",
      "[proc 1][Train](224/100000) average regularization: 0.0154190082103014\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](226/100000) average pos_loss: 0.30010515451431274\n",
      "[proc 0][Train](226/100000) average neg_loss: 0.7497737407684326\n",
      "[proc 0][Train](226/100000) average loss: 0.5249394178390503\n",
      "[proc 0][Train](226/100000) average regularization: 0.01559445634484291\n",
      "[proc 0][Train] 1 steps take 1.290 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.209, backward: 0.003, update: 1.063\n",
      "[proc 1][Train](225/100000) average pos_loss: 0.32187849283218384\n",
      "[proc 1][Train](225/100000) average neg_loss: 0.36128857731819153\n",
      "[proc 1][Train](225/100000) average loss: 0.3415835499763489\n",
      "[proc 1][Train](225/100000) average regularization: 0.015513322316110134\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.214, backward: 0.003, update: 1.077\n",
      "[proc 0][Train](227/100000) average pos_loss: 0.3546108305454254\n",
      "[proc 0][Train](227/100000) average neg_loss: 0.33639025688171387\n",
      "[proc 0][Train](227/100000) average loss: 0.34550052881240845\n",
      "[proc 0][Train](227/100000) average regularization: 0.015110126696527004\n",
      "[proc 0][Train] 1 steps take 1.275 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.199, backward: 0.004, update: 1.071\n",
      "[proc 1][Train](226/100000) average pos_loss: 0.3359788954257965\n",
      "[proc 1][Train](226/100000) average neg_loss: 0.6486777067184448\n",
      "[proc 1][Train](226/100000) average loss: 0.4923282861709595\n",
      "[proc 1][Train](226/100000) average regularization: 0.01521369069814682\n",
      "[proc 1][Train] 1 steps take 1.325 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.219, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](228/100000) average pos_loss: 0.301302969455719\n",
      "[proc 0][Train](228/100000) average neg_loss: 0.7032146453857422\n",
      "[proc 0][Train](228/100000) average loss: 0.5022587776184082\n",
      "[proc 0][Train](228/100000) average regularization: 0.015522025525569916\n",
      "[proc 0][Train] 1 steps take 1.292 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.074\n",
      "[proc 1][Train](227/100000) average pos_loss: 0.31202077865600586\n",
      "[proc 1][Train](227/100000) average neg_loss: 0.3590775728225708\n",
      "[proc 1][Train](227/100000) average loss: 0.33554917573928833\n",
      "[proc 1][Train](227/100000) average regularization: 0.015364655293524265\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](229/100000) average pos_loss: 0.3480742275714874\n",
      "[proc 0][Train](229/100000) average neg_loss: 0.341631144285202\n",
      "[proc 0][Train](229/100000) average loss: 0.3448526859283447\n",
      "[proc 0][Train](229/100000) average regularization: 0.015050224959850311\n",
      "[proc 0][Train] 1 steps take 1.275 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.194, backward: 0.002, update: 1.077\n",
      "[proc 1][Train](228/100000) average pos_loss: 0.3263677656650543\n",
      "[proc 1][Train](228/100000) average neg_loss: 0.6510063409805298\n",
      "[proc 1][Train](228/100000) average loss: 0.48868703842163086\n",
      "[proc 1][Train](228/100000) average regularization: 0.015208789147436619\n",
      "[proc 1][Train] 1 steps take 1.275 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.059\n",
      "[proc 0][Train](230/100000) average pos_loss: 0.3072734475135803\n",
      "[proc 0][Train](230/100000) average neg_loss: 0.6781185865402222\n",
      "[proc 0][Train](230/100000) average loss: 0.49269601702690125\n",
      "[proc 0][Train](230/100000) average regularization: 0.015519051812589169\n",
      "[proc 0][Train] 1 steps take 1.336 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.194, backward: 0.002, update: 1.138\n",
      "[proc 1][Train](229/100000) average pos_loss: 0.3168547749519348\n",
      "[proc 1][Train](229/100000) average neg_loss: 0.3509685695171356\n",
      "[proc 1][Train](229/100000) average loss: 0.333911657333374\n",
      "[proc 1][Train](229/100000) average regularization: 0.015347064472734928\n",
      "[proc 1][Train] 1 steps take 1.316 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](231/100000) average pos_loss: 0.34847110509872437\n",
      "[proc 0][Train](231/100000) average neg_loss: 0.35929256677627563\n",
      "[proc 0][Train](231/100000) average loss: 0.3538818359375\n",
      "[proc 0][Train](231/100000) average regularization: 0.015183264389634132\n",
      "[proc 0][Train] 1 steps take 1.279 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](230/100000) average pos_loss: 0.32129132747650146\n",
      "[proc 1][Train](230/100000) average neg_loss: 0.5978530049324036\n",
      "[proc 1][Train](230/100000) average loss: 0.4595721662044525\n",
      "[proc 1][Train](230/100000) average regularization: 0.015414766035974026\n",
      "[proc 1][Train] 1 steps take 1.309 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.092\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](232/100000) average pos_loss: 0.29202574491500854\n",
      "[proc 0][Train](232/100000) average neg_loss: 0.6949150562286377\n",
      "[proc 0][Train](232/100000) average loss: 0.4934704005718231\n",
      "[proc 0][Train](232/100000) average regularization: 0.015591820701956749\n",
      "[proc 0][Train] 1 steps take 1.312 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](231/100000) average pos_loss: 0.3106161952018738\n",
      "[proc 1][Train](231/100000) average neg_loss: 0.38122767210006714\n",
      "[proc 1][Train](231/100000) average loss: 0.34592193365097046\n",
      "[proc 1][Train](231/100000) average regularization: 0.015620836056768894\n",
      "[proc 1][Train] 1 steps take 1.278 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.060\n",
      "[proc 0][Train](233/100000) average pos_loss: 0.3491235077381134\n",
      "[proc 0][Train](233/100000) average neg_loss: 0.3277215361595154\n",
      "[proc 0][Train](233/100000) average loss: 0.3384225368499756\n",
      "[proc 0][Train](233/100000) average regularization: 0.015365601517260075\n",
      "[proc 0][Train] 1 steps take 1.303 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.086\n",
      "[proc 1][Train](232/100000) average pos_loss: 0.3353617787361145\n",
      "[proc 1][Train](232/100000) average neg_loss: 0.6139624714851379\n",
      "[proc 1][Train](232/100000) average loss: 0.4746621251106262\n",
      "[proc 1][Train](232/100000) average regularization: 0.015305989421904087\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](234/100000) average pos_loss: 0.30352360010147095\n",
      "[proc 0][Train](234/100000) average neg_loss: 0.7781400680541992\n",
      "[proc 0][Train](234/100000) average loss: 0.5408318042755127\n",
      "[proc 0][Train](234/100000) average regularization: 0.015652870759367943\n",
      "[proc 0][Train] 1 steps take 1.319 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.103\n",
      "[proc 1][Train](233/100000) average pos_loss: 0.3135688900947571\n",
      "[proc 1][Train](233/100000) average neg_loss: 0.37617242336273193\n",
      "[proc 1][Train](233/100000) average loss: 0.3448706567287445\n",
      "[proc 1][Train](233/100000) average regularization: 0.015402108430862427\n",
      "[proc 1][Train] 1 steps take 1.313 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](235/100000) average pos_loss: 0.34488558769226074\n",
      "[proc 0][Train](235/100000) average neg_loss: 0.3395845293998718\n",
      "[proc 0][Train](235/100000) average loss: 0.3422350585460663\n",
      "[proc 0][Train](235/100000) average regularization: 0.015185355208814144\n",
      "[proc 0][Train] 1 steps take 1.307 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.215, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](234/100000) average pos_loss: 0.32508230209350586\n",
      "[proc 1][Train](234/100000) average neg_loss: 0.6386675834655762\n",
      "[proc 1][Train](234/100000) average loss: 0.481874942779541\n",
      "[proc 1][Train](234/100000) average regularization: 0.015261354856193066\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.091\n",
      "[proc 0][Train](236/100000) average pos_loss: 0.29983633756637573\n",
      "[proc 0][Train](236/100000) average neg_loss: 0.7414453029632568\n",
      "[proc 0][Train](236/100000) average loss: 0.5206408500671387\n",
      "[proc 0][Train](236/100000) average regularization: 0.015510816127061844\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.079\n",
      "[proc 1][Train](235/100000) average pos_loss: 0.32201045751571655\n",
      "[proc 1][Train](235/100000) average neg_loss: 0.35165005922317505\n",
      "[proc 1][Train](235/100000) average loss: 0.3368302583694458\n",
      "[proc 1][Train](235/100000) average regularization: 0.015352263115346432\n",
      "[proc 1][Train] 1 steps take 1.316 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.097\n",
      "[proc 0][Train](237/100000) average pos_loss: 0.36062848567962646\n",
      "[proc 0][Train](237/100000) average neg_loss: 0.3471266031265259\n",
      "[proc 0][Train](237/100000) average loss: 0.35387754440307617\n",
      "[proc 0][Train](237/100000) average regularization: 0.014985859394073486\n",
      "[proc 0][Train] 1 steps take 1.308 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.093\n",
      "[proc 1][Train](236/100000) average pos_loss: 0.3383157253265381\n",
      "[proc 1][Train](236/100000) average neg_loss: 0.6762639880180359\n",
      "[proc 1][Train](236/100000) average loss: 0.5072898864746094\n",
      "[proc 1][Train](236/100000) average regularization: 0.014947313815355301\n",
      "[proc 1][Train] 1 steps take 1.336 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.116\n",
      "[proc 0][Train](238/100000) average pos_loss: 0.3016490340232849\n",
      "[proc 0][Train](238/100000) average neg_loss: 0.663417637348175\n",
      "[proc 0][Train](238/100000) average loss: 0.48253333568573\n",
      "[proc 0][Train](238/100000) average regularization: 0.01551807951182127\n",
      "[proc 0][Train] 1 steps take 1.320 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.103\n",
      "[proc 1][Train](237/100000) average pos_loss: 0.3302375078201294\n",
      "[proc 1][Train](237/100000) average neg_loss: 0.32693690061569214\n",
      "[proc 1][Train](237/100000) average loss: 0.32858720421791077\n",
      "[proc 1][Train](237/100000) average regularization: 0.015121787786483765\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.078\n",
      "[proc 0][Train](239/100000) average pos_loss: 0.3420361876487732\n",
      "[proc 0][Train](239/100000) average neg_loss: 0.34399688243865967\n",
      "[proc 0][Train](239/100000) average loss: 0.34301653504371643\n",
      "[proc 0][Train](239/100000) average regularization: 0.015037461183965206\n",
      "[proc 0][Train] 1 steps take 1.289 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.197, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](238/100000) average pos_loss: 0.32521381974220276\n",
      "[proc 1][Train](238/100000) average neg_loss: 0.6391367316246033\n",
      "[proc 1][Train](238/100000) average loss: 0.4821752905845642\n",
      "[proc 1][Train](238/100000) average regularization: 0.015182897448539734\n",
      "[proc 1][Train] 1 steps take 1.354 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.226, backward: 0.003, update: 1.124\n",
      "[proc 0][Train](240/100000) average pos_loss: 0.28735119104385376\n",
      "[proc 0][Train](240/100000) average neg_loss: 0.7404410243034363\n",
      "[proc 0][Train](240/100000) average loss: 0.513896107673645\n",
      "[proc 0][Train](240/100000) average regularization: 0.015595873817801476\n",
      "[proc 0][Train] 1 steps take 1.319 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.100\n",
      "[proc 1][Train](239/100000) average pos_loss: 0.31231749057769775\n",
      "[proc 1][Train](239/100000) average neg_loss: 0.38198140263557434\n",
      "[proc 1][Train](239/100000) average loss: 0.34714943170547485\n",
      "[proc 1][Train](239/100000) average regularization: 0.015350934118032455\n",
      "[proc 1][Train] 1 steps take 1.282 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.063\n",
      "[proc 0][Train](241/100000) average pos_loss: 0.34796345233917236\n",
      "[proc 0][Train](241/100000) average neg_loss: 0.3281790614128113\n",
      "[proc 0][Train](241/100000) average loss: 0.3380712568759918\n",
      "[proc 0][Train](241/100000) average regularization: 0.014996548183262348\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.210, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](240/100000) average pos_loss: 0.3392857313156128\n",
      "[proc 1][Train](240/100000) average neg_loss: 0.6274946331977844\n",
      "[proc 1][Train](240/100000) average loss: 0.4833901822566986\n",
      "[proc 1][Train](240/100000) average regularization: 0.015191511251032352\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.086\n",
      "[proc 0][Train](242/100000) average pos_loss: 0.3050646483898163\n",
      "[proc 0][Train](242/100000) average neg_loss: 0.7342342138290405\n",
      "[proc 0][Train](242/100000) average loss: 0.5196494460105896\n",
      "[proc 0][Train](242/100000) average regularization: 0.015540619380772114\n",
      "[proc 0][Train] 1 steps take 1.396 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.225, backward: 0.004, update: 1.150\n",
      "[proc 1][Train](241/100000) average pos_loss: 0.3166915774345398\n",
      "[proc 1][Train](241/100000) average neg_loss: 0.36186105012893677\n",
      "[proc 1][Train](241/100000) average loss: 0.3392763137817383\n",
      "[proc 1][Train](241/100000) average regularization: 0.015483559109270573\n",
      "[proc 1][Train] 1 steps take 1.345 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.213, backward: 0.003, update: 1.110\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](243/100000) average pos_loss: 0.3584411144256592\n",
      "[proc 0][Train](243/100000) average neg_loss: 0.3430354595184326\n",
      "[proc 0][Train](243/100000) average loss: 0.3507382869720459\n",
      "[proc 0][Train](243/100000) average regularization: 0.01501588337123394\n",
      "[proc 0][Train] 1 steps take 1.310 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](242/100000) average pos_loss: 0.3377494215965271\n",
      "[proc 1][Train](242/100000) average neg_loss: 0.619819164276123\n",
      "[proc 1][Train](242/100000) average loss: 0.4787842929363251\n",
      "[proc 1][Train](242/100000) average regularization: 0.015129880979657173\n",
      "[proc 1][Train] 1 steps take 1.331 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.213, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](244/100000) average pos_loss: 0.3017835319042206\n",
      "[proc 0][Train](244/100000) average neg_loss: 0.6820340156555176\n",
      "[proc 0][Train](244/100000) average loss: 0.4919087886810303\n",
      "[proc 0][Train](244/100000) average regularization: 0.01541813649237156\n",
      "[proc 0][Train] 1 steps take 1.328 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.113\n",
      "[proc 1][Train](243/100000) average pos_loss: 0.3238510489463806\n",
      "[proc 1][Train](243/100000) average neg_loss: 0.3639390468597412\n",
      "[proc 1][Train](243/100000) average loss: 0.3438950479030609\n",
      "[proc 1][Train](243/100000) average regularization: 0.01530169416218996\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](245/100000) average pos_loss: 0.3519475758075714\n",
      "[proc 0][Train](245/100000) average neg_loss: 0.34219443798065186\n",
      "[proc 0][Train](245/100000) average loss: 0.34707099199295044\n",
      "[proc 0][Train](245/100000) average regularization: 0.015106488950550556\n",
      "[proc 0][Train] 1 steps take 1.423 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.206\n",
      "[proc 1][Train](244/100000) average pos_loss: 0.3332720994949341\n",
      "[proc 1][Train](244/100000) average neg_loss: 0.6255243420600891\n",
      "[proc 1][Train](244/100000) average loss: 0.4793982207775116\n",
      "[proc 1][Train](244/100000) average regularization: 0.015283606015145779\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.110\n",
      "[proc 0][Train](246/100000) average pos_loss: 0.30825626850128174\n",
      "[proc 0][Train](246/100000) average neg_loss: 0.7128037214279175\n",
      "[proc 0][Train](246/100000) average loss: 0.5105299949645996\n",
      "[proc 0][Train](246/100000) average regularization: 0.015445247292518616\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](245/100000) average pos_loss: 0.3203652799129486\n",
      "[proc 1][Train](245/100000) average neg_loss: 0.3676280081272125\n",
      "[proc 1][Train](245/100000) average loss: 0.34399664402008057\n",
      "[proc 1][Train](245/100000) average regularization: 0.015327456407248974\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.104\n",
      "[proc 0][Train](247/100000) average pos_loss: 0.35335177183151245\n",
      "[proc 0][Train](247/100000) average neg_loss: 0.3467704653739929\n",
      "[proc 0][Train](247/100000) average loss: 0.3500611186027527\n",
      "[proc 0][Train](247/100000) average regularization: 0.015015196986496449\n",
      "[proc 0][Train] 1 steps take 1.274 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.056\n",
      "[proc 1][Train](246/100000) average pos_loss: 0.3257865309715271\n",
      "[proc 1][Train](246/100000) average neg_loss: 0.5858262777328491\n",
      "[proc 1][Train](246/100000) average loss: 0.4558064043521881\n",
      "[proc 1][Train](246/100000) average regularization: 0.015159626491367817\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.212, backward: 0.002, update: 1.096\n",
      "[proc 0][Train](248/100000) average pos_loss: 0.3021835684776306\n",
      "[proc 0][Train](248/100000) average neg_loss: 0.6689414381980896\n",
      "[proc 0][Train](248/100000) average loss: 0.4855625033378601\n",
      "[proc 0][Train](248/100000) average regularization: 0.015482270158827305\n",
      "[proc 0][Train] 1 steps take 1.274 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.206, backward: 0.003, update: 1.063\n",
      "[proc 1][Train](247/100000) average pos_loss: 0.31450390815734863\n",
      "[proc 1][Train](247/100000) average neg_loss: 0.3806423842906952\n",
      "[proc 1][Train](247/100000) average loss: 0.3475731611251831\n",
      "[proc 1][Train](247/100000) average regularization: 0.015396893955767155\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](249/100000) average pos_loss: 0.3367496132850647\n",
      "[proc 0][Train](249/100000) average neg_loss: 0.3480880856513977\n",
      "[proc 0][Train](249/100000) average loss: 0.3424188494682312\n",
      "[proc 0][Train](249/100000) average regularization: 0.01532635185867548\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](248/100000) average pos_loss: 0.3211578130722046\n",
      "[proc 1][Train](248/100000) average neg_loss: 0.635310173034668\n",
      "[proc 1][Train](248/100000) average loss: 0.4782339930534363\n",
      "[proc 1][Train](248/100000) average regularization: 0.0154037456959486\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.089\n",
      "[proc 0][Train](250/100000) average pos_loss: 0.30789250135421753\n",
      "[proc 0][Train](250/100000) average neg_loss: 0.7827944755554199\n",
      "[proc 0][Train](250/100000) average loss: 0.5453435182571411\n",
      "[proc 0][Train](250/100000) average regularization: 0.015531533397734165\n",
      "[proc 0][Train] 1 steps take 1.277 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.063\n",
      "[proc 1][Train](249/100000) average pos_loss: 0.31784629821777344\n",
      "[proc 1][Train](249/100000) average neg_loss: 0.3347976803779602\n",
      "[proc 1][Train](249/100000) average loss: 0.3263219892978668\n",
      "[proc 1][Train](249/100000) average regularization: 0.015506603755056858\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](251/100000) average pos_loss: 0.3651110827922821\n",
      "[proc 0][Train](251/100000) average neg_loss: 0.32780131697654724\n",
      "[proc 0][Train](251/100000) average loss: 0.3464561998844147\n",
      "[proc 0][Train](251/100000) average regularization: 0.015046331100165844\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.080\n",
      "[proc 1][Train](250/100000) average pos_loss: 0.3369087874889374\n",
      "[proc 1][Train](250/100000) average neg_loss: 0.6058769226074219\n",
      "[proc 1][Train](250/100000) average loss: 0.4713928699493408\n",
      "[proc 1][Train](250/100000) average regularization: 0.015207014046609402\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.097\n",
      "[proc 0][Train](252/100000) average pos_loss: 0.299506276845932\n",
      "[proc 0][Train](252/100000) average neg_loss: 0.7152562737464905\n",
      "[proc 0][Train](252/100000) average loss: 0.50738126039505\n",
      "[proc 0][Train](252/100000) average regularization: 0.015519960783421993\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](251/100000) average pos_loss: 0.3155798614025116\n",
      "[proc 1][Train](251/100000) average neg_loss: 0.37268227338790894\n",
      "[proc 1][Train](251/100000) average loss: 0.3441310524940491\n",
      "[proc 1][Train](251/100000) average regularization: 0.015377961099147797\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](253/100000) average pos_loss: 0.3424338698387146\n",
      "[proc 0][Train](253/100000) average neg_loss: 0.3590005040168762\n",
      "[proc 0][Train](253/100000) average loss: 0.3507171869277954\n",
      "[proc 0][Train](253/100000) average regularization: 0.015011247247457504\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.089\n",
      "[proc 1][Train](252/100000) average pos_loss: 0.33466288447380066\n",
      "[proc 1][Train](252/100000) average neg_loss: 0.636135458946228\n",
      "[proc 1][Train](252/100000) average loss: 0.48539918661117554\n",
      "[proc 1][Train](252/100000) average regularization: 0.01519746147096157\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.086\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](254/100000) average pos_loss: 0.3071923851966858\n",
      "[proc 0][Train](254/100000) average neg_loss: 0.64238440990448\n",
      "[proc 0][Train](254/100000) average loss: 0.4747883975505829\n",
      "[proc 0][Train](254/100000) average regularization: 0.015494081191718578\n",
      "[proc 0][Train] 1 steps take 1.338 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.121\n",
      "[proc 1][Train](253/100000) average pos_loss: 0.3173582851886749\n",
      "[proc 1][Train](253/100000) average neg_loss: 0.3529200553894043\n",
      "[proc 1][Train](253/100000) average loss: 0.3351391553878784\n",
      "[proc 1][Train](253/100000) average regularization: 0.01532350666821003\n",
      "[proc 1][Train] 1 steps take 1.337 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.121\n",
      "[proc 0][Train](255/100000) average pos_loss: 0.3479519486427307\n",
      "[proc 0][Train](255/100000) average neg_loss: 0.3749035596847534\n",
      "[proc 0][Train](255/100000) average loss: 0.36142775416374207\n",
      "[proc 0][Train](255/100000) average regularization: 0.015135291963815689\n",
      "[proc 0][Train] 1 steps take 1.288 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.002, update: 1.079\n",
      "[proc 1][Train](254/100000) average pos_loss: 0.31857287883758545\n",
      "[proc 1][Train](254/100000) average neg_loss: 0.6562121510505676\n",
      "[proc 1][Train](254/100000) average loss: 0.48739251494407654\n",
      "[proc 1][Train](254/100000) average regularization: 0.01529289223253727\n",
      "[proc 1][Train] 1 steps take 1.356 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.136\n",
      "[proc 0][Train](256/100000) average pos_loss: 0.3027290105819702\n",
      "[proc 0][Train](256/100000) average neg_loss: 0.7047042846679688\n",
      "[proc 0][Train](256/100000) average loss: 0.5037166476249695\n",
      "[proc 0][Train](256/100000) average regularization: 0.015498216263949871\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](255/100000) average pos_loss: 0.3248956501483917\n",
      "[proc 1][Train](255/100000) average neg_loss: 0.3380446434020996\n",
      "[proc 1][Train](255/100000) average loss: 0.3314701318740845\n",
      "[proc 1][Train](255/100000) average regularization: 0.015283994376659393\n",
      "[proc 1][Train] 1 steps take 1.358 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.144\n",
      "[proc 0][Train](257/100000) average pos_loss: 0.35908472537994385\n",
      "[proc 0][Train](257/100000) average neg_loss: 0.3075697720050812\n",
      "[proc 0][Train](257/100000) average loss: 0.3333272337913513\n",
      "[proc 0][Train](257/100000) average regularization: 0.015173806808888912\n",
      "[proc 0][Train] 1 steps take 1.331 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.197, backward: 0.003, update: 1.114\n",
      "[proc 1][Train](256/100000) average pos_loss: 0.33039283752441406\n",
      "[proc 1][Train](256/100000) average neg_loss: 0.6081803441047668\n",
      "[proc 1][Train](256/100000) average loss: 0.46928659081459045\n",
      "[proc 1][Train](256/100000) average regularization: 0.015366800129413605\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](258/100000) average pos_loss: 0.3007965683937073\n",
      "[proc 0][Train](258/100000) average neg_loss: 0.7444930076599121\n",
      "[proc 0][Train](258/100000) average loss: 0.5226447582244873\n",
      "[proc 0][Train](258/100000) average regularization: 0.015630310401320457\n",
      "[proc 0][Train] 1 steps take 1.321 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.210, backward: 0.002, update: 1.092\n",
      "[proc 1][Train](257/100000) average pos_loss: 0.3105013072490692\n",
      "[proc 1][Train](257/100000) average neg_loss: 0.3971903920173645\n",
      "[proc 1][Train](257/100000) average loss: 0.35384583473205566\n",
      "[proc 1][Train](257/100000) average regularization: 0.015410001389682293\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.214, backward: 0.003, update: 1.077\n",
      "[proc 0][Train](259/100000) average pos_loss: 0.3499423861503601\n",
      "[proc 0][Train](259/100000) average neg_loss: 0.3487914800643921\n",
      "[proc 0][Train](259/100000) average loss: 0.3493669331073761\n",
      "[proc 0][Train](259/100000) average regularization: 0.01508972980082035\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.083\n",
      "[proc 1][Train](258/100000) average pos_loss: 0.3372465968132019\n",
      "[proc 1][Train](258/100000) average neg_loss: 0.6180749535560608\n",
      "[proc 1][Train](258/100000) average loss: 0.47766077518463135\n",
      "[proc 1][Train](258/100000) average regularization: 0.015150192193686962\n",
      "[proc 1][Train] 1 steps take 1.268 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.206, backward: 0.003, update: 1.043\n",
      "[proc 0][Train](260/100000) average pos_loss: 0.3140343129634857\n",
      "[proc 0][Train](260/100000) average neg_loss: 0.6866897344589233\n",
      "[proc 0][Train](260/100000) average loss: 0.5003620386123657\n",
      "[proc 0][Train](260/100000) average regularization: 0.015395676717162132\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](259/100000) average pos_loss: 0.32287806272506714\n",
      "[proc 1][Train](259/100000) average neg_loss: 0.3541436195373535\n",
      "[proc 1][Train](259/100000) average loss: 0.3385108411312103\n",
      "[proc 1][Train](259/100000) average regularization: 0.015213130973279476\n",
      "[proc 1][Train] 1 steps take 1.309 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.101\n",
      "[proc 0][Train](261/100000) average pos_loss: 0.35027211904525757\n",
      "[proc 0][Train](261/100000) average neg_loss: 0.3652918338775635\n",
      "[proc 0][Train](261/100000) average loss: 0.3577819764614105\n",
      "[proc 0][Train](261/100000) average regularization: 0.015162929892539978\n",
      "[proc 0][Train] 1 steps take 1.319 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.106\n",
      "[proc 1][Train](260/100000) average pos_loss: 0.32942643761634827\n",
      "[proc 1][Train](260/100000) average neg_loss: 0.6489945650100708\n",
      "[proc 1][Train](260/100000) average loss: 0.48921048641204834\n",
      "[proc 1][Train](260/100000) average regularization: 0.015263051725924015\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.107\n",
      "[proc 0][Train](262/100000) average pos_loss: 0.3110629618167877\n",
      "[proc 0][Train](262/100000) average neg_loss: 0.6747862100601196\n",
      "[proc 0][Train](262/100000) average loss: 0.4929245710372925\n",
      "[proc 0][Train](262/100000) average regularization: 0.015407983213663101\n",
      "[proc 0][Train] 1 steps take 1.343 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.222, backward: 0.003, update: 1.117\n",
      "[proc 1][Train](261/100000) average pos_loss: 0.3266211152076721\n",
      "[proc 1][Train](261/100000) average neg_loss: 0.3502010107040405\n",
      "[proc 1][Train](261/100000) average loss: 0.3384110629558563\n",
      "[proc 1][Train](261/100000) average regularization: 0.015219295397400856\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.216, backward: 0.002, update: 1.083\n",
      "[proc 0][Train](263/100000) average pos_loss: 0.3572177588939667\n",
      "[proc 0][Train](263/100000) average neg_loss: 0.32696279883384705\n",
      "[proc 0][Train](263/100000) average loss: 0.34209027886390686\n",
      "[proc 0][Train](263/100000) average regularization: 0.014938417822122574\n",
      "[proc 0][Train] 1 steps take 1.308 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](262/100000) average pos_loss: 0.32608622312545776\n",
      "[proc 1][Train](262/100000) average neg_loss: 0.650445818901062\n",
      "[proc 1][Train](262/100000) average loss: 0.4882660210132599\n",
      "[proc 1][Train](262/100000) average regularization: 0.015193164348602295\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.106\n",
      "[proc 0][Train](264/100000) average pos_loss: 0.30038630962371826\n",
      "[proc 0][Train](264/100000) average neg_loss: 0.7706828117370605\n",
      "[proc 0][Train](264/100000) average loss: 0.5355345606803894\n",
      "[proc 0][Train](264/100000) average regularization: 0.015513269230723381\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](263/100000) average pos_loss: 0.31719526648521423\n",
      "[proc 1][Train](263/100000) average neg_loss: 0.3447667360305786\n",
      "[proc 1][Train](263/100000) average loss: 0.3309810161590576\n",
      "[proc 1][Train](263/100000) average regularization: 0.015285369008779526\n",
      "[proc 1][Train] 1 steps take 1.305 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.087\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](265/100000) average pos_loss: 0.3684707283973694\n",
      "[proc 0][Train](265/100000) average neg_loss: 0.3297612965106964\n",
      "[proc 0][Train](265/100000) average loss: 0.3491160273551941\n",
      "[proc 0][Train](265/100000) average regularization: 0.01499052532017231\n",
      "[proc 0][Train] 1 steps take 1.317 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.004, update: 1.095\n",
      "[proc 1][Train](264/100000) average pos_loss: 0.3437402844429016\n",
      "[proc 1][Train](264/100000) average neg_loss: 0.5964688062667847\n",
      "[proc 1][Train](264/100000) average loss: 0.47010454535484314\n",
      "[proc 1][Train](264/100000) average regularization: 0.015148703008890152\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.086\n",
      "[proc 0][Train](266/100000) average pos_loss: 0.3090328872203827\n",
      "[proc 0][Train](266/100000) average neg_loss: 0.6944539546966553\n",
      "[proc 0][Train](266/100000) average loss: 0.5017434358596802\n",
      "[proc 0][Train](266/100000) average regularization: 0.01531126443296671\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.220, backward: 0.005, update: 1.086\n",
      "[proc 1][Train](265/100000) average pos_loss: 0.31656014919281006\n",
      "[proc 1][Train](265/100000) average neg_loss: 0.3618542551994324\n",
      "[proc 1][Train](265/100000) average loss: 0.3392072021961212\n",
      "[proc 1][Train](265/100000) average regularization: 0.015163672156631947\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.080\n",
      "[proc 0][Train](267/100000) average pos_loss: 0.3461737036705017\n",
      "[proc 0][Train](267/100000) average neg_loss: 0.3317839503288269\n",
      "[proc 0][Train](267/100000) average loss: 0.3389788269996643\n",
      "[proc 0][Train](267/100000) average regularization: 0.015210093930363655\n",
      "[proc 0][Train] 1 steps take 1.286 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.198, backward: 0.004, update: 1.083\n",
      "[proc 1][Train](266/100000) average pos_loss: 0.3251752257347107\n",
      "[proc 1][Train](266/100000) average neg_loss: 0.6390042304992676\n",
      "[proc 1][Train](266/100000) average loss: 0.48208972811698914\n",
      "[proc 1][Train](266/100000) average regularization: 0.015244806185364723\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](268/100000) average pos_loss: 0.29679635167121887\n",
      "[proc 0][Train](268/100000) average neg_loss: 0.7565788626670837\n",
      "[proc 0][Train](268/100000) average loss: 0.5266876220703125\n",
      "[proc 0][Train](268/100000) average regularization: 0.015496326610445976\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.004, update: 1.091\n",
      "[proc 1][Train](267/100000) average pos_loss: 0.3227088451385498\n",
      "[proc 1][Train](267/100000) average neg_loss: 0.38567763566970825\n",
      "[proc 1][Train](267/100000) average loss: 0.35419324040412903\n",
      "[proc 1][Train](267/100000) average regularization: 0.01524923462420702\n",
      "[proc 1][Train] 1 steps take 1.315 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](269/100000) average pos_loss: 0.36085039377212524\n",
      "[proc 0][Train](269/100000) average neg_loss: 0.31675493717193604\n",
      "[proc 0][Train](269/100000) average loss: 0.33880266547203064\n",
      "[proc 0][Train](269/100000) average regularization: 0.014895503409206867\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.212, backward: 0.004, update: 1.084\n",
      "[proc 1][Train](268/100000) average pos_loss: 0.3409107029438019\n",
      "[proc 1][Train](268/100000) average neg_loss: 0.6279346942901611\n",
      "[proc 1][Train](268/100000) average loss: 0.4844226837158203\n",
      "[proc 1][Train](268/100000) average regularization: 0.015133448876440525\n",
      "[proc 1][Train] 1 steps take 1.283 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.067\n",
      "[proc 0][Train](270/100000) average pos_loss: 0.3053615689277649\n",
      "[proc 0][Train](270/100000) average neg_loss: 0.6966812610626221\n",
      "[proc 0][Train](270/100000) average loss: 0.5010213851928711\n",
      "[proc 0][Train](270/100000) average regularization: 0.015520207583904266\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.078\n",
      "[proc 1][Train](269/100000) average pos_loss: 0.3200433850288391\n",
      "[proc 1][Train](269/100000) average neg_loss: 0.3438469171524048\n",
      "[proc 1][Train](269/100000) average loss: 0.33194515109062195\n",
      "[proc 1][Train](269/100000) average regularization: 0.015176008455455303\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](271/100000) average pos_loss: 0.3489118218421936\n",
      "[proc 0][Train](271/100000) average neg_loss: 0.35580289363861084\n",
      "[proc 0][Train](271/100000) average loss: 0.3523573577404022\n",
      "[proc 0][Train](271/100000) average regularization: 0.014932279475033283\n",
      "[proc 0][Train] 1 steps take 1.283 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.197, backward: 0.004, update: 1.081\n",
      "[proc 1][Train](270/100000) average pos_loss: 0.3247794508934021\n",
      "[proc 1][Train](270/100000) average neg_loss: 0.6523898839950562\n",
      "[proc 1][Train](270/100000) average loss: 0.4885846674442291\n",
      "[proc 1][Train](270/100000) average regularization: 0.015181178227066994\n",
      "[proc 1][Train] 1 steps take 1.305 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.086\n",
      "[proc 0][Train](272/100000) average pos_loss: 0.30184221267700195\n",
      "[proc 0][Train](272/100000) average neg_loss: 0.6948412656784058\n",
      "[proc 0][Train](272/100000) average loss: 0.49834173917770386\n",
      "[proc 0][Train](272/100000) average regularization: 0.015378192067146301\n",
      "[proc 0][Train] 1 steps take 1.331 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.112\n",
      "[proc 1][Train](271/100000) average pos_loss: 0.3275233209133148\n",
      "[proc 1][Train](271/100000) average neg_loss: 0.3215057849884033\n",
      "[proc 1][Train](271/100000) average loss: 0.32451456785202026\n",
      "[proc 1][Train](271/100000) average regularization: 0.01528643723577261\n",
      "[proc 1][Train] 1 steps take 1.368 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.151\n",
      "[proc 0][Train](273/100000) average pos_loss: 0.3609663248062134\n",
      "[proc 0][Train](273/100000) average neg_loss: 0.3101395070552826\n",
      "[proc 0][Train](273/100000) average loss: 0.3355529308319092\n",
      "[proc 0][Train](273/100000) average regularization: 0.015102384611964226\n",
      "[proc 0][Train] 1 steps take 1.340 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.209, backward: 0.004, update: 1.109\n",
      "[proc 1][Train](272/100000) average pos_loss: 0.33193308115005493\n",
      "[proc 1][Train](272/100000) average neg_loss: 0.6088317036628723\n",
      "[proc 1][Train](272/100000) average loss: 0.4703823924064636\n",
      "[proc 1][Train](272/100000) average regularization: 0.015078301541507244\n",
      "[proc 1][Train] 1 steps take 1.308 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](274/100000) average pos_loss: 0.2990414500236511\n",
      "[proc 0][Train](274/100000) average neg_loss: 0.7067291736602783\n",
      "[proc 0][Train](274/100000) average loss: 0.5028853416442871\n",
      "[proc 0][Train](274/100000) average regularization: 0.015397765673696995\n",
      "[proc 0][Train] 1 steps take 1.320 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.211, backward: 0.003, update: 1.090\n",
      "[proc 1][Train](273/100000) average pos_loss: 0.3126218914985657\n",
      "[proc 1][Train](273/100000) average neg_loss: 0.33782702684402466\n",
      "[proc 1][Train](273/100000) average loss: 0.32522445917129517\n",
      "[proc 1][Train](273/100000) average regularization: 0.015421398915350437\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.214, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](275/100000) average pos_loss: 0.345781147480011\n",
      "[proc 0][Train](275/100000) average neg_loss: 0.34913378953933716\n",
      "[proc 0][Train](275/100000) average loss: 0.3474574685096741\n",
      "[proc 0][Train](275/100000) average regularization: 0.015151992440223694\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.078\n",
      "[proc 1][Train](274/100000) average pos_loss: 0.3287159502506256\n",
      "[proc 1][Train](274/100000) average neg_loss: 0.6879144310951233\n",
      "[proc 1][Train](274/100000) average loss: 0.5083152055740356\n",
      "[proc 1][Train](274/100000) average regularization: 0.015466430224478245\n",
      "[proc 1][Train] 1 steps take 1.316 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.211, backward: 0.002, update: 1.089\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](276/100000) average pos_loss: 0.3000584840774536\n",
      "[proc 0][Train](276/100000) average neg_loss: 0.6936894059181213\n",
      "[proc 0][Train](276/100000) average loss: 0.4968739449977875\n",
      "[proc 0][Train](276/100000) average regularization: 0.015617508441209793\n",
      "[proc 0][Train] 1 steps take 1.320 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.104\n",
      "[proc 1][Train](275/100000) average pos_loss: 0.31911981105804443\n",
      "[proc 1][Train](275/100000) average neg_loss: 0.3642953634262085\n",
      "[proc 1][Train](275/100000) average loss: 0.34170758724212646\n",
      "[proc 1][Train](275/100000) average regularization: 0.015256596729159355\n",
      "[proc 1][Train] 1 steps take 1.284 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.066\n",
      "[proc 0][Train](277/100000) average pos_loss: 0.35385942459106445\n",
      "[proc 0][Train](277/100000) average neg_loss: 0.3390606641769409\n",
      "[proc 0][Train](277/100000) average loss: 0.3464600443840027\n",
      "[proc 0][Train](277/100000) average regularization: 0.015110031701624393\n",
      "[proc 0][Train] 1 steps take 1.288 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.210, backward: 0.003, update: 1.073\n",
      "[proc 1][Train](276/100000) average pos_loss: 0.3330036401748657\n",
      "[proc 1][Train](276/100000) average neg_loss: 0.5713247656822205\n",
      "[proc 1][Train](276/100000) average loss: 0.4521642029285431\n",
      "[proc 1][Train](276/100000) average regularization: 0.015224834904074669\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](278/100000) average pos_loss: 0.31684112548828125\n",
      "[proc 0][Train](278/100000) average neg_loss: 0.6923967599868774\n",
      "[proc 0][Train](278/100000) average loss: 0.5046189427375793\n",
      "[proc 0][Train](278/100000) average regularization: 0.015269165858626366\n",
      "[proc 0][Train] 1 steps take 1.296 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.080\n",
      "[proc 1][Train](277/100000) average pos_loss: 0.3168637752532959\n",
      "[proc 1][Train](277/100000) average neg_loss: 0.3230435252189636\n",
      "[proc 1][Train](277/100000) average loss: 0.31995365023612976\n",
      "[proc 1][Train](277/100000) average regularization: 0.01536854263395071\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.216, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](279/100000) average pos_loss: 0.348180890083313\n",
      "[proc 0][Train](279/100000) average neg_loss: 0.3250105381011963\n",
      "[proc 0][Train](279/100000) average loss: 0.33659571409225464\n",
      "[proc 0][Train](279/100000) average regularization: 0.015181293711066246\n",
      "[proc 0][Train] 1 steps take 1.366 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.149\n",
      "[proc 1][Train](278/100000) average pos_loss: 0.3291434049606323\n",
      "[proc 1][Train](278/100000) average neg_loss: 0.6855130195617676\n",
      "[proc 1][Train](278/100000) average loss: 0.5073282122612\n",
      "[proc 1][Train](278/100000) average regularization: 0.015224598348140717\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](280/100000) average pos_loss: 0.29519543051719666\n",
      "[proc 0][Train](280/100000) average neg_loss: 0.7287115454673767\n",
      "[proc 0][Train](280/100000) average loss: 0.5119534730911255\n",
      "[proc 0][Train](280/100000) average regularization: 0.015604635700583458\n",
      "[proc 0][Train] 1 steps take 1.346 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.126\n",
      "[proc 1][Train](279/100000) average pos_loss: 0.31859034299850464\n",
      "[proc 1][Train](279/100000) average neg_loss: 0.3810279071331024\n",
      "[proc 1][Train](279/100000) average loss: 0.34980911016464233\n",
      "[proc 1][Train](279/100000) average regularization: 0.015199536457657814\n",
      "[proc 1][Train] 1 steps take 1.314 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](281/100000) average pos_loss: 0.36516067385673523\n",
      "[proc 0][Train](281/100000) average neg_loss: 0.33128687739372253\n",
      "[proc 0][Train](281/100000) average loss: 0.3482237756252289\n",
      "[proc 0][Train](281/100000) average regularization: 0.014963013119995594\n",
      "[proc 0][Train] 1 steps take 1.335 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.119\n",
      "[proc 1][Train](280/100000) average pos_loss: 0.3338368535041809\n",
      "[proc 1][Train](280/100000) average neg_loss: 0.624030590057373\n",
      "[proc 1][Train](280/100000) average loss: 0.478933721780777\n",
      "[proc 1][Train](280/100000) average regularization: 0.01511877030134201\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.089\n",
      "[proc 0][Train](282/100000) average pos_loss: 0.30730682611465454\n",
      "[proc 0][Train](282/100000) average neg_loss: 0.6567826867103577\n",
      "[proc 0][Train](282/100000) average loss: 0.4820447564125061\n",
      "[proc 0][Train](282/100000) average regularization: 0.01531737670302391\n",
      "[proc 0][Train] 1 steps take 1.278 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.057\n",
      "[proc 1][Train](281/100000) average pos_loss: 0.32394516468048096\n",
      "[proc 1][Train](281/100000) average neg_loss: 0.34576496481895447\n",
      "[proc 1][Train](281/100000) average loss: 0.3348550796508789\n",
      "[proc 1][Train](281/100000) average regularization: 0.01538417860865593\n",
      "[proc 1][Train] 1 steps take 1.283 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.200, backward: 0.001, update: 1.080\n",
      "[proc 0][Train](283/100000) average pos_loss: 0.3482518792152405\n",
      "[proc 0][Train](283/100000) average neg_loss: 0.3560757040977478\n",
      "[proc 0][Train](283/100000) average loss: 0.35216379165649414\n",
      "[proc 0][Train](283/100000) average regularization: 0.015043199993669987\n",
      "[proc 0][Train] 1 steps take 1.283 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.065\n",
      "[proc 1][Train](282/100000) average pos_loss: 0.3205324411392212\n",
      "[proc 1][Train](282/100000) average neg_loss: 0.6821472644805908\n",
      "[proc 1][Train](282/100000) average loss: 0.501339852809906\n",
      "[proc 1][Train](282/100000) average regularization: 0.015400285832583904\n",
      "[proc 1][Train] 1 steps take 1.367 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.280, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](284/100000) average pos_loss: 0.3022099435329437\n",
      "[proc 0][Train](284/100000) average neg_loss: 0.7036652565002441\n",
      "[proc 0][Train](284/100000) average loss: 0.5029376149177551\n",
      "[proc 0][Train](284/100000) average regularization: 0.015450315549969673\n",
      "[proc 0][Train] 1 steps take 1.280 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.076\n",
      "[proc 1][Train](283/100000) average pos_loss: 0.326718270778656\n",
      "[proc 1][Train](283/100000) average neg_loss: 0.3450155258178711\n",
      "[proc 1][Train](283/100000) average loss: 0.33586689829826355\n",
      "[proc 1][Train](283/100000) average regularization: 0.015124307945370674\n",
      "[proc 1][Train] 1 steps take 1.345 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.235, backward: 0.003, update: 1.106\n",
      "[proc 0][Train](285/100000) average pos_loss: 0.35440242290496826\n",
      "[proc 0][Train](285/100000) average neg_loss: 0.34572839736938477\n",
      "[proc 0][Train](285/100000) average loss: 0.3500654101371765\n",
      "[proc 0][Train](285/100000) average regularization: 0.015049424953758717\n",
      "[proc 0][Train] 1 steps take 1.305 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.089\n",
      "[proc 1][Train](284/100000) average pos_loss: 0.33157840371131897\n",
      "[proc 1][Train](284/100000) average neg_loss: 0.6497460603713989\n",
      "[proc 1][Train](284/100000) average loss: 0.49066221714019775\n",
      "[proc 1][Train](284/100000) average regularization: 0.01519195456057787\n",
      "[proc 1][Train] 1 steps take 1.266 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.056\n",
      "[proc 0][Train](286/100000) average pos_loss: 0.3071140646934509\n",
      "[proc 0][Train](286/100000) average neg_loss: 0.6984391808509827\n",
      "[proc 0][Train](286/100000) average loss: 0.5027766227722168\n",
      "[proc 0][Train](286/100000) average regularization: 0.015350131317973137\n",
      "[proc 0][Train] 1 steps take 1.287 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.070\n",
      "[proc 1][Train](285/100000) average pos_loss: 0.3302353620529175\n",
      "[proc 1][Train](285/100000) average neg_loss: 0.36459439992904663\n",
      "[proc 1][Train](285/100000) average loss: 0.34741488099098206\n",
      "[proc 1][Train](285/100000) average regularization: 0.015098228119313717\n",
      "[proc 1][Train] 1 steps take 1.315 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.098\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](287/100000) average pos_loss: 0.35897165536880493\n",
      "[proc 0][Train](287/100000) average neg_loss: 0.3334609866142273\n",
      "[proc 0][Train](287/100000) average loss: 0.3462163209915161\n",
      "[proc 0][Train](287/100000) average regularization: 0.01503517385572195\n",
      "[proc 0][Train] 1 steps take 1.322 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.107\n",
      "[proc 1][Train](286/100000) average pos_loss: 0.3282528519630432\n",
      "[proc 1][Train](286/100000) average neg_loss: 0.6216282844543457\n",
      "[proc 1][Train](286/100000) average loss: 0.47494056820869446\n",
      "[proc 1][Train](286/100000) average regularization: 0.0150356600061059\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.002, update: 1.096\n",
      "[proc 0][Train](288/100000) average pos_loss: 0.308067262172699\n",
      "[proc 0][Train](288/100000) average neg_loss: 0.6844441890716553\n",
      "[proc 0][Train](288/100000) average loss: 0.4962557256221771\n",
      "[proc 0][Train](288/100000) average regularization: 0.015303071588277817\n",
      "[proc 0][Train] 1 steps take 1.341 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.219, backward: 0.003, update: 1.118\n",
      "[proc 1][Train](287/100000) average pos_loss: 0.32247358560562134\n",
      "[proc 1][Train](287/100000) average neg_loss: 0.3265686631202698\n",
      "[proc 1][Train](287/100000) average loss: 0.32452112436294556\n",
      "[proc 1][Train](287/100000) average regularization: 0.015122050419449806\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.094\n",
      "[proc 0][Train](289/100000) average pos_loss: 0.35360872745513916\n",
      "[proc 0][Train](289/100000) average neg_loss: 0.3512348234653473\n",
      "[proc 0][Train](289/100000) average loss: 0.35242176055908203\n",
      "[proc 0][Train](289/100000) average regularization: 0.015097911469638348\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.212, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](288/100000) average pos_loss: 0.32202333211898804\n",
      "[proc 1][Train](288/100000) average neg_loss: 0.6252521276473999\n",
      "[proc 1][Train](288/100000) average loss: 0.47363772988319397\n",
      "[proc 1][Train](288/100000) average regularization: 0.015279739163815975\n",
      "[proc 1][Train] 1 steps take 1.327 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.108\n",
      "[proc 0][Train](290/100000) average pos_loss: 0.305686354637146\n",
      "[proc 0][Train](290/100000) average neg_loss: 0.7152959704399109\n",
      "[proc 0][Train](290/100000) average loss: 0.510491132736206\n",
      "[proc 0][Train](290/100000) average regularization: 0.015420710667967796\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.204, backward: 0.003, update: 1.076\n",
      "[proc 1][Train](289/100000) average pos_loss: 0.31857961416244507\n",
      "[proc 1][Train](289/100000) average neg_loss: 0.35811948776245117\n",
      "[proc 1][Train](289/100000) average loss: 0.3383495509624481\n",
      "[proc 1][Train](289/100000) average regularization: 0.015323885716497898\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.214, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](291/100000) average pos_loss: 0.3498278558254242\n",
      "[proc 0][Train](291/100000) average neg_loss: 0.34141451120376587\n",
      "[proc 0][Train](291/100000) average loss: 0.34562116861343384\n",
      "[proc 0][Train](291/100000) average regularization: 0.014982080087065697\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.197, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](290/100000) average pos_loss: 0.3266299366950989\n",
      "[proc 1][Train](290/100000) average neg_loss: 0.6319854855537415\n",
      "[proc 1][Train](290/100000) average loss: 0.47930771112442017\n",
      "[proc 1][Train](290/100000) average regularization: 0.015230856835842133\n",
      "[proc 1][Train] 1 steps take 1.325 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.211, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](292/100000) average pos_loss: 0.31205976009368896\n",
      "[proc 0][Train](292/100000) average neg_loss: 0.6659303307533264\n",
      "[proc 0][Train](292/100000) average loss: 0.4889950454235077\n",
      "[proc 0][Train](292/100000) average regularization: 0.01525061298161745\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.082\n",
      "[proc 1][Train](291/100000) average pos_loss: 0.3236043155193329\n",
      "[proc 1][Train](291/100000) average neg_loss: 0.355066180229187\n",
      "[proc 1][Train](291/100000) average loss: 0.33933526277542114\n",
      "[proc 1][Train](291/100000) average regularization: 0.015167990699410439\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.002, update: 1.086\n",
      "[proc 0][Train](293/100000) average pos_loss: 0.3464059829711914\n",
      "[proc 0][Train](293/100000) average neg_loss: 0.33988288044929504\n",
      "[proc 0][Train](293/100000) average loss: 0.34314441680908203\n",
      "[proc 0][Train](293/100000) average regularization: 0.015226474963128567\n",
      "[proc 0][Train] 1 steps take 1.270 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.196, backward: 0.002, update: 1.070\n",
      "[proc 1][Train](292/100000) average pos_loss: 0.33184993267059326\n",
      "[proc 1][Train](292/100000) average neg_loss: 0.6356245279312134\n",
      "[proc 1][Train](292/100000) average loss: 0.4837372303009033\n",
      "[proc 1][Train](292/100000) average regularization: 0.015371819026768208\n",
      "[proc 1][Train] 1 steps take 1.269 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.053\n",
      "[proc 0][Train](294/100000) average pos_loss: 0.3099905848503113\n",
      "[proc 0][Train](294/100000) average neg_loss: 0.6847591400146484\n",
      "[proc 0][Train](294/100000) average loss: 0.49737486243247986\n",
      "[proc 0][Train](294/100000) average regularization: 0.01540560182183981\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.197, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](293/100000) average pos_loss: 0.32314619421958923\n",
      "[proc 1][Train](293/100000) average neg_loss: 0.34300634264945984\n",
      "[proc 1][Train](293/100000) average loss: 0.33307626843452454\n",
      "[proc 1][Train](293/100000) average regularization: 0.015361389145255089\n",
      "[proc 1][Train] 1 steps take 1.270 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.056\n",
      "[proc 0][Train](295/100000) average pos_loss: 0.35445892810821533\n",
      "[proc 0][Train](295/100000) average neg_loss: 0.30145514011383057\n",
      "[proc 0][Train](295/100000) average loss: 0.32795703411102295\n",
      "[proc 0][Train](295/100000) average regularization: 0.015198268927633762\n",
      "[proc 0][Train] 1 steps take 1.290 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.075\n",
      "[proc 1][Train](294/100000) average pos_loss: 0.330822229385376\n",
      "[proc 1][Train](294/100000) average neg_loss: 0.7283629775047302\n",
      "[proc 1][Train](294/100000) average loss: 0.5295926332473755\n",
      "[proc 1][Train](294/100000) average regularization: 0.015160092152655125\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.084\n",
      "[proc 0][Train](296/100000) average pos_loss: 0.2971062660217285\n",
      "[proc 0][Train](296/100000) average neg_loss: 0.7584857940673828\n",
      "[proc 0][Train](296/100000) average loss: 0.5277960300445557\n",
      "[proc 0][Train](296/100000) average regularization: 0.015515195205807686\n",
      "[proc 0][Train] 1 steps take 1.294 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.077\n",
      "[proc 1][Train](295/100000) average pos_loss: 0.33446410298347473\n",
      "[proc 1][Train](295/100000) average neg_loss: 0.36703580617904663\n",
      "[proc 1][Train](295/100000) average loss: 0.3507499694824219\n",
      "[proc 1][Train](295/100000) average regularization: 0.015126512385904789\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](297/100000) average pos_loss: 0.3626713752746582\n",
      "[proc 0][Train](297/100000) average neg_loss: 0.33108285069465637\n",
      "[proc 0][Train](297/100000) average loss: 0.3468770980834961\n",
      "[proc 0][Train](297/100000) average regularization: 0.014924504794180393\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.199, backward: 0.002, update: 1.083\n",
      "[proc 1][Train](296/100000) average pos_loss: 0.3469561040401459\n",
      "[proc 1][Train](296/100000) average neg_loss: 0.59422367811203\n",
      "[proc 1][Train](296/100000) average loss: 0.47058987617492676\n",
      "[proc 1][Train](296/100000) average regularization: 0.015072901733219624\n",
      "[proc 1][Train] 1 steps take 1.296 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.080\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](298/100000) average pos_loss: 0.3119547665119171\n",
      "[proc 0][Train](298/100000) average neg_loss: 0.6682519912719727\n",
      "[proc 0][Train](298/100000) average loss: 0.4901033639907837\n",
      "[proc 0][Train](298/100000) average regularization: 0.01528087630867958\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.091\n",
      "[proc 1][Train](297/100000) average pos_loss: 0.3260771334171295\n",
      "[proc 1][Train](297/100000) average neg_loss: 0.35610729455947876\n",
      "[proc 1][Train](297/100000) average loss: 0.34109222888946533\n",
      "[proc 1][Train](297/100000) average regularization: 0.01522324699908495\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.071\n",
      "[proc 0][Train](299/100000) average pos_loss: 0.3571634292602539\n",
      "[proc 0][Train](299/100000) average neg_loss: 0.3375365734100342\n",
      "[proc 0][Train](299/100000) average loss: 0.34735000133514404\n",
      "[proc 0][Train](299/100000) average regularization: 0.014904857613146305\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.101\n",
      "[proc 1][Train](298/100000) average pos_loss: 0.3315618932247162\n",
      "[proc 1][Train](298/100000) average neg_loss: 0.6193962097167969\n",
      "[proc 1][Train](298/100000) average loss: 0.4754790663719177\n",
      "[proc 1][Train](298/100000) average regularization: 0.015112250111997128\n",
      "[proc 1][Train] 1 steps take 1.313 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](300/100000) average pos_loss: 0.30144011974334717\n",
      "[proc 0][Train](300/100000) average neg_loss: 0.7073191404342651\n",
      "[proc 0][Train](300/100000) average loss: 0.5043796300888062\n",
      "[proc 0][Train](300/100000) average regularization: 0.015359564684331417\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.079\n",
      "[proc 1][Train](299/100000) average pos_loss: 0.3251880407333374\n",
      "[proc 1][Train](299/100000) average neg_loss: 0.34123530983924866\n",
      "[proc 1][Train](299/100000) average loss: 0.33321166038513184\n",
      "[proc 1][Train](299/100000) average regularization: 0.015351387672126293\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.078\n",
      "[proc 0][Train](301/100000) average pos_loss: 0.35813039541244507\n",
      "[proc 0][Train](301/100000) average neg_loss: 0.3212183117866516\n",
      "[proc 0][Train](301/100000) average loss: 0.33967435359954834\n",
      "[proc 0][Train](301/100000) average regularization: 0.015135890804231167\n",
      "[proc 0][Train] 1 steps take 1.291 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.077\n",
      "[proc 1][Train](300/100000) average pos_loss: 0.3355943560600281\n",
      "[proc 1][Train](300/100000) average neg_loss: 0.6102296710014343\n",
      "[proc 1][Train](300/100000) average loss: 0.4729120135307312\n",
      "[proc 1][Train](300/100000) average regularization: 0.015217877924442291\n",
      "[proc 1][Train] 1 steps take 1.301 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.002, update: 1.091\n",
      "[proc 0][Train](302/100000) average pos_loss: 0.30547910928726196\n",
      "[proc 0][Train](302/100000) average neg_loss: 0.6956757307052612\n",
      "[proc 0][Train](302/100000) average loss: 0.500577449798584\n",
      "[proc 0][Train](302/100000) average regularization: 0.015245414339005947\n",
      "[proc 0][Train] 1 steps take 1.271 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.192, backward: 0.002, update: 1.076\n",
      "[proc 1][Train](301/100000) average pos_loss: 0.321800172328949\n",
      "[proc 1][Train](301/100000) average neg_loss: 0.351040244102478\n",
      "[proc 1][Train](301/100000) average loss: 0.3364202082157135\n",
      "[proc 1][Train](301/100000) average regularization: 0.01519883144646883\n",
      "[proc 1][Train] 1 steps take 1.274 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.056\n",
      "[proc 0][Train](303/100000) average pos_loss: 0.35205337405204773\n",
      "[proc 0][Train](303/100000) average neg_loss: 0.3562449514865875\n",
      "[proc 0][Train](303/100000) average loss: 0.3541491627693176\n",
      "[proc 0][Train](303/100000) average regularization: 0.015049695037305355\n",
      "[proc 0][Train] 1 steps take 1.295 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.081\n",
      "[proc 1][Train](302/100000) average pos_loss: 0.33044880628585815\n",
      "[proc 1][Train](302/100000) average neg_loss: 0.6324598789215088\n",
      "[proc 1][Train](302/100000) average loss: 0.48145434260368347\n",
      "[proc 1][Train](302/100000) average regularization: 0.015286369249224663\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](304/100000) average pos_loss: 0.30850350856781006\n",
      "[proc 0][Train](304/100000) average neg_loss: 0.71781325340271\n",
      "[proc 0][Train](304/100000) average loss: 0.51315838098526\n",
      "[proc 0][Train](304/100000) average regularization: 0.015379222109913826\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.101\n",
      "[proc 1][Train](303/100000) average pos_loss: 0.3178848624229431\n",
      "[proc 1][Train](303/100000) average neg_loss: 0.34650829434394836\n",
      "[proc 1][Train](303/100000) average loss: 0.33219659328460693\n",
      "[proc 1][Train](303/100000) average regularization: 0.015255453065037727\n",
      "[proc 1][Train] 1 steps take 1.284 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.067\n",
      "[proc 0][Train](305/100000) average pos_loss: 0.3567350506782532\n",
      "[proc 0][Train](305/100000) average neg_loss: 0.33690083026885986\n",
      "[proc 0][Train](305/100000) average loss: 0.3468179404735565\n",
      "[proc 0][Train](305/100000) average regularization: 0.014986823312938213\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.213, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](304/100000) average pos_loss: 0.33378350734710693\n",
      "[proc 1][Train](304/100000) average neg_loss: 0.6602312922477722\n",
      "[proc 1][Train](304/100000) average loss: 0.4970073997974396\n",
      "[proc 1][Train](304/100000) average regularization: 0.015061498619616032\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.089\n",
      "[proc 0][Train](306/100000) average pos_loss: 0.3111613988876343\n",
      "[proc 0][Train](306/100000) average neg_loss: 0.6630013585090637\n",
      "[proc 0][Train](306/100000) average loss: 0.487081378698349\n",
      "[proc 0][Train](306/100000) average regularization: 0.015362894162535667\n",
      "[proc 0][Train] 1 steps take 1.324 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.213, backward: 0.003, update: 1.093\n",
      "[proc 1][Train](305/100000) average pos_loss: 0.32657480239868164\n",
      "[proc 1][Train](305/100000) average neg_loss: 0.3143821656703949\n",
      "[proc 1][Train](305/100000) average loss: 0.32047849893569946\n",
      "[proc 1][Train](305/100000) average regularization: 0.015033036470413208\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.215, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](307/100000) average pos_loss: 0.3625020384788513\n",
      "[proc 0][Train](307/100000) average neg_loss: 0.323757529258728\n",
      "[proc 0][Train](307/100000) average loss: 0.3431297838687897\n",
      "[proc 0][Train](307/100000) average regularization: 0.014904801733791828\n",
      "[proc 0][Train] 1 steps take 1.326 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.112\n",
      "[proc 1][Train](306/100000) average pos_loss: 0.327026903629303\n",
      "[proc 1][Train](306/100000) average neg_loss: 0.6344643831253052\n",
      "[proc 1][Train](306/100000) average loss: 0.4807456433773041\n",
      "[proc 1][Train](306/100000) average regularization: 0.015165070071816444\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.214, backward: 0.003, update: 1.069\n",
      "[proc 0][Train](308/100000) average pos_loss: 0.301014244556427\n",
      "[proc 0][Train](308/100000) average neg_loss: 0.6799921989440918\n",
      "[proc 0][Train](308/100000) average loss: 0.4905032217502594\n",
      "[proc 0][Train](308/100000) average regularization: 0.015499272383749485\n",
      "[proc 0][Train] 1 steps take 1.341 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.137\n",
      "[proc 1][Train](307/100000) average pos_loss: 0.3195313811302185\n",
      "[proc 1][Train](307/100000) average neg_loss: 0.35541945695877075\n",
      "[proc 1][Train](307/100000) average loss: 0.33747541904449463\n",
      "[proc 1][Train](307/100000) average regularization: 0.015236089006066322\n",
      "[proc 1][Train] 1 steps take 1.288 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.077\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](309/100000) average pos_loss: 0.34375351667404175\n",
      "[proc 0][Train](309/100000) average neg_loss: 0.31486600637435913\n",
      "[proc 0][Train](309/100000) average loss: 0.32930976152420044\n",
      "[proc 0][Train](309/100000) average regularization: 0.015276026912033558\n",
      "[proc 0][Train] 1 steps take 1.326 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.108\n",
      "[proc 1][Train](308/100000) average pos_loss: 0.33612823486328125\n",
      "[proc 1][Train](308/100000) average neg_loss: 0.6869352459907532\n",
      "[proc 1][Train](308/100000) average loss: 0.5115317106246948\n",
      "[proc 1][Train](308/100000) average regularization: 0.015167301520705223\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](310/100000) average pos_loss: 0.31486976146698\n",
      "[proc 0][Train](310/100000) average neg_loss: 0.7569669485092163\n",
      "[proc 0][Train](310/100000) average loss: 0.5359183549880981\n",
      "[proc 0][Train](310/100000) average regularization: 0.01534111239016056\n",
      "[proc 0][Train] 1 steps take 1.341 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.124\n",
      "[proc 1][Train](309/100000) average pos_loss: 0.3347611129283905\n",
      "[proc 1][Train](309/100000) average neg_loss: 0.32704150676727295\n",
      "[proc 1][Train](309/100000) average loss: 0.3309013247489929\n",
      "[proc 1][Train](309/100000) average regularization: 0.015111253596842289\n",
      "[proc 1][Train] 1 steps take 1.370 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.153\n",
      "[proc 0][Train](311/100000) average pos_loss: 0.3809465765953064\n",
      "[proc 0][Train](311/100000) average neg_loss: 0.31514906883239746\n",
      "[proc 0][Train](311/100000) average loss: 0.34804782271385193\n",
      "[proc 0][Train](311/100000) average regularization: 0.014908071607351303\n",
      "[proc 0][Train] 1 steps take 1.324 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.108\n",
      "[proc 1][Train](310/100000) average pos_loss: 0.3403352200984955\n",
      "[proc 1][Train](310/100000) average neg_loss: 0.597651481628418\n",
      "[proc 1][Train](310/100000) average loss: 0.4689933657646179\n",
      "[proc 1][Train](310/100000) average regularization: 0.015108582563698292\n",
      "[proc 1][Train] 1 steps take 1.331 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.219, backward: 0.003, update: 1.108\n",
      "[proc 0][Train](312/100000) average pos_loss: 0.31246495246887207\n",
      "[proc 0][Train](312/100000) average neg_loss: 0.6574059724807739\n",
      "[proc 0][Train](312/100000) average loss: 0.484935462474823\n",
      "[proc 0][Train](312/100000) average regularization: 0.015241795219480991\n",
      "[proc 0][Train] 1 steps take 1.283 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.066\n",
      "[proc 1][Train](311/100000) average pos_loss: 0.3102332353591919\n",
      "[proc 1][Train](311/100000) average neg_loss: 0.3561649024486542\n",
      "[proc 1][Train](311/100000) average loss: 0.33319908380508423\n",
      "[proc 1][Train](311/100000) average regularization: 0.015231706202030182\n",
      "[proc 1][Train] 1 steps take 1.316 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.099\n",
      "[proc 0][Train](313/100000) average pos_loss: 0.34708213806152344\n",
      "[proc 0][Train](313/100000) average neg_loss: 0.35975876450538635\n",
      "[proc 0][Train](313/100000) average loss: 0.3534204363822937\n",
      "[proc 0][Train](313/100000) average regularization: 0.015079629607498646\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](312/100000) average pos_loss: 0.3306373953819275\n",
      "[proc 1][Train](312/100000) average neg_loss: 0.6901409029960632\n",
      "[proc 1][Train](312/100000) average loss: 0.5103891491889954\n",
      "[proc 1][Train](312/100000) average regularization: 0.014963309280574322\n",
      "[proc 1][Train] 1 steps take 1.346 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.126\n",
      "[proc 0][Train](314/100000) average pos_loss: 0.3095235228538513\n",
      "[proc 0][Train](314/100000) average neg_loss: 0.6878240704536438\n",
      "[proc 0][Train](314/100000) average loss: 0.49867379665374756\n",
      "[proc 0][Train](314/100000) average regularization: 0.015149613842368126\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.004, update: 1.095\n",
      "[proc 1][Train](313/100000) average pos_loss: 0.33490920066833496\n",
      "[proc 1][Train](313/100000) average neg_loss: 0.32106295228004456\n",
      "[proc 1][Train](313/100000) average loss: 0.32798606157302856\n",
      "[proc 1][Train](313/100000) average regularization: 0.015027277171611786\n",
      "[proc 1][Train] 1 steps take 1.291 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.219, backward: 0.003, update: 1.067\n",
      "[proc 0][Train](315/100000) average pos_loss: 0.36709558963775635\n",
      "[proc 0][Train](315/100000) average neg_loss: 0.3195630609989166\n",
      "[proc 0][Train](315/100000) average loss: 0.3433293104171753\n",
      "[proc 0][Train](315/100000) average regularization: 0.01491662859916687\n",
      "[proc 0][Train] 1 steps take 1.275 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.196, backward: 0.004, update: 1.073\n",
      "[proc 1][Train](314/100000) average pos_loss: 0.335800439119339\n",
      "[proc 1][Train](314/100000) average neg_loss: 0.6246857047080994\n",
      "[proc 1][Train](314/100000) average loss: 0.48024308681488037\n",
      "[proc 1][Train](314/100000) average regularization: 0.015115148387849331\n",
      "[proc 1][Train] 1 steps take 1.314 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.093\n",
      "[proc 0][Train](316/100000) average pos_loss: 0.3035280704498291\n",
      "[proc 0][Train](316/100000) average neg_loss: 0.723585844039917\n",
      "[proc 0][Train](316/100000) average loss: 0.513556957244873\n",
      "[proc 0][Train](316/100000) average regularization: 0.01528650987893343\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.087\n",
      "[proc 1][Train](315/100000) average pos_loss: 0.3223244547843933\n",
      "[proc 1][Train](315/100000) average neg_loss: 0.3604600727558136\n",
      "[proc 1][Train](315/100000) average loss: 0.34139227867126465\n",
      "[proc 1][Train](315/100000) average regularization: 0.015015656128525734\n",
      "[proc 1][Train] 1 steps take 1.267 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.059\n",
      "[proc 0][Train](317/100000) average pos_loss: 0.36646756529808044\n",
      "[proc 0][Train](317/100000) average neg_loss: 0.30763542652130127\n",
      "[proc 0][Train](317/100000) average loss: 0.33705151081085205\n",
      "[proc 0][Train](317/100000) average regularization: 0.014896075241267681\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.004, update: 1.088\n",
      "[proc 1][Train](316/100000) average pos_loss: 0.34608396887779236\n",
      "[proc 1][Train](316/100000) average neg_loss: 0.5700137615203857\n",
      "[proc 1][Train](316/100000) average loss: 0.45804888010025024\n",
      "[proc 1][Train](316/100000) average regularization: 0.015061631798744202\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](318/100000) average pos_loss: 0.3070756793022156\n",
      "[proc 0][Train](318/100000) average neg_loss: 0.7243032455444336\n",
      "[proc 0][Train](318/100000) average loss: 0.515689492225647\n",
      "[proc 0][Train](318/100000) average regularization: 0.015263303183019161\n",
      "[proc 0][Train] 1 steps take 1.287 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](317/100000) average pos_loss: 0.3078809082508087\n",
      "[proc 1][Train](317/100000) average neg_loss: 0.3670597970485687\n",
      "[proc 1][Train](317/100000) average loss: 0.3374703526496887\n",
      "[proc 1][Train](317/100000) average regularization: 0.01518987026065588\n",
      "[proc 1][Train] 1 steps take 1.295 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](319/100000) average pos_loss: 0.3407798111438751\n",
      "[proc 0][Train](319/100000) average neg_loss: 0.3657313883304596\n",
      "[proc 0][Train](319/100000) average loss: 0.35325559973716736\n",
      "[proc 0][Train](319/100000) average regularization: 0.01504749245941639\n",
      "[proc 0][Train] 1 steps take 1.296 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.079\n",
      "[proc 1][Train](318/100000) average pos_loss: 0.3350386619567871\n",
      "[proc 1][Train](318/100000) average neg_loss: 0.6346094608306885\n",
      "[proc 1][Train](318/100000) average loss: 0.4848240613937378\n",
      "[proc 1][Train](318/100000) average regularization: 0.014978453516960144\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.100\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](320/100000) average pos_loss: 0.3173139691352844\n",
      "[proc 0][Train](320/100000) average neg_loss: 0.6595077514648438\n",
      "[proc 0][Train](320/100000) average loss: 0.4884108603000641\n",
      "[proc 0][Train](320/100000) average regularization: 0.015076537616550922\n",
      "[proc 0][Train] 1 steps take 1.299 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.086\n",
      "[proc 1][Train](319/100000) average pos_loss: 0.3398570716381073\n",
      "[proc 1][Train](319/100000) average neg_loss: 0.3670007586479187\n",
      "[proc 1][Train](319/100000) average loss: 0.3534289002418518\n",
      "[proc 1][Train](319/100000) average regularization: 0.014967051334679127\n",
      "[proc 1][Train] 1 steps take 1.282 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.073\n",
      "[proc 0][Train](321/100000) average pos_loss: 0.3571981191635132\n",
      "[proc 0][Train](321/100000) average neg_loss: 0.30771195888519287\n",
      "[proc 0][Train](321/100000) average loss: 0.332455039024353\n",
      "[proc 0][Train](321/100000) average regularization: 0.014944720081984997\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.210, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](320/100000) average pos_loss: 0.32916390895843506\n",
      "[proc 1][Train](320/100000) average neg_loss: 0.6350713968276978\n",
      "[proc 1][Train](320/100000) average loss: 0.4821176528930664\n",
      "[proc 1][Train](320/100000) average regularization: 0.0150447404012084\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](322/100000) average pos_loss: 0.2956416606903076\n",
      "[proc 0][Train](322/100000) average neg_loss: 0.70127934217453\n",
      "[proc 0][Train](322/100000) average loss: 0.4984605014324188\n",
      "[proc 0][Train](322/100000) average regularization: 0.015223611146211624\n",
      "[proc 0][Train] 1 steps take 1.280 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.195, backward: 0.003, update: 1.066\n",
      "[proc 1][Train](321/100000) average pos_loss: 0.3288393020629883\n",
      "[proc 1][Train](321/100000) average neg_loss: 0.3444504737854004\n",
      "[proc 1][Train](321/100000) average loss: 0.33664488792419434\n",
      "[proc 1][Train](321/100000) average regularization: 0.015176394023001194\n",
      "[proc 1][Train] 1 steps take 1.309 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.212, backward: 0.003, update: 1.077\n",
      "[proc 0][Train](323/100000) average pos_loss: 0.36112144589424133\n",
      "[proc 0][Train](323/100000) average neg_loss: 0.345630407333374\n",
      "[proc 0][Train](323/100000) average loss: 0.3533759117126465\n",
      "[proc 0][Train](323/100000) average regularization: 0.015076631680130959\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.198, backward: 0.002, update: 1.102\n",
      "[proc 1][Train](322/100000) average pos_loss: 0.33034586906433105\n",
      "[proc 1][Train](322/100000) average neg_loss: 0.5705215930938721\n",
      "[proc 1][Train](322/100000) average loss: 0.45043373107910156\n",
      "[proc 1][Train](322/100000) average regularization: 0.015033303759992123\n",
      "[proc 1][Train] 1 steps take 1.390 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.213, backward: 0.003, update: 1.156\n",
      "[proc 0][Train](324/100000) average pos_loss: 0.31065288186073303\n",
      "[proc 0][Train](324/100000) average neg_loss: 0.689688503742218\n",
      "[proc 0][Train](324/100000) average loss: 0.5001707077026367\n",
      "[proc 0][Train](324/100000) average regularization: 0.015153666958212852\n",
      "[proc 0][Train] 1 steps take 1.322 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.102\n",
      "[proc 1][Train](323/100000) average pos_loss: 0.32396554946899414\n",
      "[proc 1][Train](323/100000) average neg_loss: 0.3573049306869507\n",
      "[proc 1][Train](323/100000) average loss: 0.3406352400779724\n",
      "[proc 1][Train](323/100000) average regularization: 0.015124917030334473\n",
      "[proc 1][Train] 1 steps take 1.346 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.127\n",
      "[proc 0][Train](325/100000) average pos_loss: 0.34754514694213867\n",
      "[proc 0][Train](325/100000) average neg_loss: 0.33700448274612427\n",
      "[proc 0][Train](325/100000) average loss: 0.34227481484413147\n",
      "[proc 0][Train](325/100000) average regularization: 0.014938926324248314\n",
      "[proc 0][Train] 1 steps take 1.336 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.220, backward: 0.003, update: 1.112\n",
      "[proc 1][Train](324/100000) average pos_loss: 0.32877376675605774\n",
      "[proc 1][Train](324/100000) average neg_loss: 0.6616734266281128\n",
      "[proc 1][Train](324/100000) average loss: 0.4952235817909241\n",
      "[proc 1][Train](324/100000) average regularization: 0.015210689045488834\n",
      "[proc 1][Train] 1 steps take 1.337 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.123\n",
      "[proc 0][Train](326/100000) average pos_loss: 0.305165559053421\n",
      "[proc 0][Train](326/100000) average neg_loss: 0.6775180101394653\n",
      "[proc 0][Train](326/100000) average loss: 0.491341769695282\n",
      "[proc 0][Train](326/100000) average regularization: 0.015325427055358887\n",
      "[proc 0][Train] 1 steps take 1.377 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.162\n",
      "[proc 1][Train](325/100000) average pos_loss: 0.31856128573417664\n",
      "[proc 1][Train](325/100000) average neg_loss: 0.3294677734375\n",
      "[proc 1][Train](325/100000) average loss: 0.3240145444869995\n",
      "[proc 1][Train](325/100000) average regularization: 0.015118267387151718\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.202, backward: 0.003, update: 1.105\n",
      "[proc 0][Train](327/100000) average pos_loss: 0.3472832143306732\n",
      "[proc 0][Train](327/100000) average neg_loss: 0.3442508578300476\n",
      "[proc 0][Train](327/100000) average loss: 0.3457670211791992\n",
      "[proc 0][Train](327/100000) average regularization: 0.015034383162856102\n",
      "[proc 0][Train] 1 steps take 1.291 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.196, backward: 0.002, update: 1.091\n",
      "[proc 1][Train](326/100000) average pos_loss: 0.32694023847579956\n",
      "[proc 1][Train](326/100000) average neg_loss: 0.6603686213493347\n",
      "[proc 1][Train](326/100000) average loss: 0.49365442991256714\n",
      "[proc 1][Train](326/100000) average regularization: 0.015034984797239304\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.102\n",
      "[proc 0][Train](328/100000) average pos_loss: 0.31708914041519165\n",
      "[proc 0][Train](328/100000) average neg_loss: 0.7215451002120972\n",
      "[proc 0][Train](328/100000) average loss: 0.5193171501159668\n",
      "[proc 0][Train](328/100000) average regularization: 0.015209582634270191\n",
      "[proc 0][Train] 1 steps take 1.341 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.126\n",
      "[proc 1][Train](327/100000) average pos_loss: 0.3342338502407074\n",
      "[proc 1][Train](327/100000) average neg_loss: 0.33475857973098755\n",
      "[proc 1][Train](327/100000) average loss: 0.3344962000846863\n",
      "[proc 1][Train](327/100000) average regularization: 0.01505845133215189\n",
      "[proc 1][Train] 1 steps take 1.318 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.100\n",
      "[proc 0][Train](329/100000) average pos_loss: 0.3644598126411438\n",
      "[proc 0][Train](329/100000) average neg_loss: 0.3020765781402588\n",
      "[proc 0][Train](329/100000) average loss: 0.3332681953907013\n",
      "[proc 0][Train](329/100000) average regularization: 0.014883498661220074\n",
      "[proc 0][Train] 1 steps take 1.440 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.220\n",
      "[proc 1][Train](328/100000) average pos_loss: 0.3432014286518097\n",
      "[proc 1][Train](328/100000) average neg_loss: 0.6168540120124817\n",
      "[proc 1][Train](328/100000) average loss: 0.4800277352333069\n",
      "[proc 1][Train](328/100000) average regularization: 0.014905845746397972\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](330/100000) average pos_loss: 0.31136876344680786\n",
      "[proc 0][Train](330/100000) average neg_loss: 0.694851815700531\n",
      "[proc 0][Train](330/100000) average loss: 0.5031102895736694\n",
      "[proc 0][Train](330/100000) average regularization: 0.015142399817705154\n",
      "[proc 0][Train] 1 steps take 1.271 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.197, backward: 0.003, update: 1.070\n",
      "[proc 1][Train](329/100000) average pos_loss: 0.31736212968826294\n",
      "[proc 1][Train](329/100000) average neg_loss: 0.34717273712158203\n",
      "[proc 1][Train](329/100000) average loss: 0.3322674334049225\n",
      "[proc 1][Train](329/100000) average regularization: 0.015170590952038765\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.072\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](331/100000) average pos_loss: 0.3493880331516266\n",
      "[proc 0][Train](331/100000) average neg_loss: 0.3536442518234253\n",
      "[proc 0][Train](331/100000) average loss: 0.35151612758636475\n",
      "[proc 0][Train](331/100000) average regularization: 0.014841921627521515\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.086\n",
      "[proc 1][Train](330/100000) average pos_loss: 0.33174625039100647\n",
      "[proc 1][Train](330/100000) average neg_loss: 0.6557189226150513\n",
      "[proc 1][Train](330/100000) average loss: 0.4937325716018677\n",
      "[proc 1][Train](330/100000) average regularization: 0.015108844265341759\n",
      "[proc 1][Train] 1 steps take 1.412 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.265, backward: 0.003, update: 1.143\n",
      "[proc 0][Train](332/100000) average pos_loss: 0.3155990540981293\n",
      "[proc 0][Train](332/100000) average neg_loss: 0.7302566766738892\n",
      "[proc 0][Train](332/100000) average loss: 0.5229278802871704\n",
      "[proc 0][Train](332/100000) average regularization: 0.015035858377814293\n",
      "[proc 0][Train] 1 steps take 1.327 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.106\n",
      "[proc 1][Train](331/100000) average pos_loss: 0.3340228497982025\n",
      "[proc 1][Train](331/100000) average neg_loss: 0.3275812864303589\n",
      "[proc 1][Train](331/100000) average loss: 0.3308020830154419\n",
      "[proc 1][Train](331/100000) average regularization: 0.014972136355936527\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](333/100000) average pos_loss: 0.3666280508041382\n",
      "[proc 0][Train](333/100000) average neg_loss: 0.3117106556892395\n",
      "[proc 0][Train](333/100000) average loss: 0.33916935324668884\n",
      "[proc 0][Train](333/100000) average regularization: 0.01482480764389038\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.082\n",
      "[proc 1][Train](332/100000) average pos_loss: 0.3424956500530243\n",
      "[proc 1][Train](332/100000) average neg_loss: 0.6242308616638184\n",
      "[proc 1][Train](332/100000) average loss: 0.4833632707595825\n",
      "[proc 1][Train](332/100000) average regularization: 0.01486220769584179\n",
      "[proc 1][Train] 1 steps take 1.275 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.003, update: 1.063\n",
      "[proc 0][Train](334/100000) average pos_loss: 0.3076665997505188\n",
      "[proc 0][Train](334/100000) average neg_loss: 0.7332661747932434\n",
      "[proc 0][Train](334/100000) average loss: 0.5204663872718811\n",
      "[proc 0][Train](334/100000) average regularization: 0.015157396905124187\n",
      "[proc 0][Train] 1 steps take 1.325 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](333/100000) average pos_loss: 0.319542795419693\n",
      "[proc 1][Train](333/100000) average neg_loss: 0.35340389609336853\n",
      "[proc 1][Train](333/100000) average loss: 0.33647334575653076\n",
      "[proc 1][Train](333/100000) average regularization: 0.015113497152924538\n",
      "[proc 1][Train] 1 steps take 1.295 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.076\n",
      "[proc 0][Train](335/100000) average pos_loss: 0.35322561860084534\n",
      "[proc 0][Train](335/100000) average neg_loss: 0.3348159193992615\n",
      "[proc 0][Train](335/100000) average loss: 0.3440207839012146\n",
      "[proc 0][Train](335/100000) average regularization: 0.01497307512909174\n",
      "[proc 0][Train] 1 steps take 1.323 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.196, backward: 0.002, update: 1.123\n",
      "[proc 1][Train](334/100000) average pos_loss: 0.34167078137397766\n",
      "[proc 1][Train](334/100000) average neg_loss: 0.6035473346710205\n",
      "[proc 1][Train](334/100000) average loss: 0.4726090431213379\n",
      "[proc 1][Train](334/100000) average regularization: 0.01486114040017128\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.087\n",
      "[proc 0][Train](336/100000) average pos_loss: 0.32307153940200806\n",
      "[proc 0][Train](336/100000) average neg_loss: 0.699343204498291\n",
      "[proc 0][Train](336/100000) average loss: 0.5112073421478271\n",
      "[proc 0][Train](336/100000) average regularization: 0.015116262249648571\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.087\n",
      "[proc 1][Train](335/100000) average pos_loss: 0.3201364278793335\n",
      "[proc 1][Train](335/100000) average neg_loss: 0.33989423513412476\n",
      "[proc 1][Train](335/100000) average loss: 0.3300153315067291\n",
      "[proc 1][Train](335/100000) average regularization: 0.015049360692501068\n",
      "[proc 1][Train] 1 steps take 1.288 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.072\n",
      "[proc 0][Train](337/100000) average pos_loss: 0.36248424649238586\n",
      "[proc 0][Train](337/100000) average neg_loss: 0.3221568465232849\n",
      "[proc 0][Train](337/100000) average loss: 0.3423205614089966\n",
      "[proc 0][Train](337/100000) average regularization: 0.014769665896892548\n",
      "[proc 0][Train] 1 steps take 1.318 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.211, backward: 0.003, update: 1.086\n",
      "[proc 1][Train](336/100000) average pos_loss: 0.3325861692428589\n",
      "[proc 1][Train](336/100000) average neg_loss: 0.6240257024765015\n",
      "[proc 1][Train](336/100000) average loss: 0.4783059358596802\n",
      "[proc 1][Train](336/100000) average regularization: 0.014989778399467468\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.076\n",
      "[proc 0][Train](338/100000) average pos_loss: 0.30518269538879395\n",
      "[proc 0][Train](338/100000) average neg_loss: 0.7484732866287231\n",
      "[proc 0][Train](338/100000) average loss: 0.5268279910087585\n",
      "[proc 0][Train](338/100000) average regularization: 0.01514038722962141\n",
      "[proc 0][Train] 1 steps take 1.337 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.213, backward: 0.003, update: 1.105\n",
      "[proc 1][Train](337/100000) average pos_loss: 0.32136011123657227\n",
      "[proc 1][Train](337/100000) average neg_loss: 0.35397011041641235\n",
      "[proc 1][Train](337/100000) average loss: 0.3376651108264923\n",
      "[proc 1][Train](337/100000) average regularization: 0.014979253523051739\n",
      "[proc 1][Train] 1 steps take 1.323 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.213, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](339/100000) average pos_loss: 0.3670988082885742\n",
      "[proc 0][Train](339/100000) average neg_loss: 0.312115341424942\n",
      "[proc 0][Train](339/100000) average loss: 0.3396070599555969\n",
      "[proc 0][Train](339/100000) average regularization: 0.014811626635491848\n",
      "[proc 0][Train] 1 steps take 1.303 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](338/100000) average pos_loss: 0.3451802730560303\n",
      "[proc 1][Train](338/100000) average neg_loss: 0.6369085311889648\n",
      "[proc 1][Train](338/100000) average loss: 0.49104440212249756\n",
      "[proc 1][Train](338/100000) average regularization: 0.014859006740152836\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.210, backward: 0.003, update: 1.091\n",
      "[proc 0][Train](340/100000) average pos_loss: 0.3100714683532715\n",
      "[proc 0][Train](340/100000) average neg_loss: 0.6726857423782349\n",
      "[proc 0][Train](340/100000) average loss: 0.4913786053657532\n",
      "[proc 0][Train](340/100000) average regularization: 0.01498747244477272\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](339/100000) average pos_loss: 0.3304789066314697\n",
      "[proc 1][Train](339/100000) average neg_loss: 0.32937729358673096\n",
      "[proc 1][Train](339/100000) average loss: 0.32992810010910034\n",
      "[proc 1][Train](339/100000) average regularization: 0.014875920489430428\n",
      "[proc 1][Train] 1 steps take 1.282 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.198, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](341/100000) average pos_loss: 0.35701847076416016\n",
      "[proc 0][Train](341/100000) average neg_loss: 0.314018577337265\n",
      "[proc 0][Train](341/100000) average loss: 0.3355185389518738\n",
      "[proc 0][Train](341/100000) average regularization: 0.014773362316191196\n",
      "[proc 0][Train] 1 steps take 1.258 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.043\n",
      "[proc 1][Train](340/100000) average pos_loss: 0.337360143661499\n",
      "[proc 1][Train](340/100000) average neg_loss: 0.677440881729126\n",
      "[proc 1][Train](340/100000) average loss: 0.5074005126953125\n",
      "[proc 1][Train](340/100000) average regularization: 0.014892494305968285\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.097\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](342/100000) average pos_loss: 0.30711305141448975\n",
      "[proc 0][Train](342/100000) average neg_loss: 0.710051417350769\n",
      "[proc 0][Train](342/100000) average loss: 0.5085822343826294\n",
      "[proc 0][Train](342/100000) average regularization: 0.015143835917115211\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.092\n",
      "[proc 1][Train](341/100000) average pos_loss: 0.3236681818962097\n",
      "[proc 1][Train](341/100000) average neg_loss: 0.3249400854110718\n",
      "[proc 1][Train](341/100000) average loss: 0.32430413365364075\n",
      "[proc 1][Train](341/100000) average regularization: 0.014910345897078514\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](343/100000) average pos_loss: 0.35947299003601074\n",
      "[proc 0][Train](343/100000) average neg_loss: 0.3176184892654419\n",
      "[proc 0][Train](343/100000) average loss: 0.3385457396507263\n",
      "[proc 0][Train](343/100000) average regularization: 0.014845745638012886\n",
      "[proc 0][Train] 1 steps take 1.367 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.152\n",
      "[proc 1][Train](342/100000) average pos_loss: 0.3346937298774719\n",
      "[proc 1][Train](342/100000) average neg_loss: 0.6317704916000366\n",
      "[proc 1][Train](342/100000) average loss: 0.4832321107387543\n",
      "[proc 1][Train](342/100000) average regularization: 0.01484691258519888\n",
      "[proc 1][Train] 1 steps take 1.347 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.232, backward: 0.003, update: 1.110\n",
      "[proc 0][Train](344/100000) average pos_loss: 0.3100234270095825\n",
      "[proc 0][Train](344/100000) average neg_loss: 0.6479048728942871\n",
      "[proc 0][Train](344/100000) average loss: 0.4789641499519348\n",
      "[proc 0][Train](344/100000) average regularization: 0.015122375451028347\n",
      "[proc 0][Train] 1 steps take 1.281 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.218, backward: 0.002, update: 1.060\n",
      "[proc 1][Train](343/100000) average pos_loss: 0.32157406210899353\n",
      "[proc 1][Train](343/100000) average neg_loss: 0.3319999575614929\n",
      "[proc 1][Train](343/100000) average loss: 0.32678699493408203\n",
      "[proc 1][Train](343/100000) average regularization: 0.014987359754741192\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.097\n",
      "[proc 0][Train](345/100000) average pos_loss: 0.3504777252674103\n",
      "[proc 0][Train](345/100000) average neg_loss: 0.3240455090999603\n",
      "[proc 0][Train](345/100000) average loss: 0.3372616171836853\n",
      "[proc 0][Train](345/100000) average regularization: 0.014883012510836124\n",
      "[proc 0][Train] 1 steps take 1.296 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](344/100000) average pos_loss: 0.3319563865661621\n",
      "[proc 1][Train](344/100000) average neg_loss: 0.6869055032730103\n",
      "[proc 1][Train](344/100000) average loss: 0.5094309449195862\n",
      "[proc 1][Train](344/100000) average regularization: 0.015100025571882725\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](346/100000) average pos_loss: 0.3120783567428589\n",
      "[proc 0][Train](346/100000) average neg_loss: 0.7614527344703674\n",
      "[proc 0][Train](346/100000) average loss: 0.5367655754089355\n",
      "[proc 0][Train](346/100000) average regularization: 0.01518664974719286\n",
      "[proc 0][Train] 1 steps take 1.265 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.059\n",
      "[proc 1][Train](345/100000) average pos_loss: 0.329008549451828\n",
      "[proc 1][Train](345/100000) average neg_loss: 0.3331184387207031\n",
      "[proc 1][Train](345/100000) average loss: 0.33106350898742676\n",
      "[proc 1][Train](345/100000) average regularization: 0.014887413941323757\n",
      "[proc 1][Train] 1 steps take 1.344 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.126\n",
      "[proc 0][Train](347/100000) average pos_loss: 0.3729029893875122\n",
      "[proc 0][Train](347/100000) average neg_loss: 0.32194191217422485\n",
      "[proc 0][Train](347/100000) average loss: 0.34742245078086853\n",
      "[proc 0][Train](347/100000) average regularization: 0.014771437272429466\n",
      "[proc 0][Train] 1 steps take 1.326 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.223, backward: 0.003, update: 1.099\n",
      "[proc 1][Train](346/100000) average pos_loss: 0.34312230348587036\n",
      "[proc 1][Train](346/100000) average neg_loss: 0.6187034845352173\n",
      "[proc 1][Train](346/100000) average loss: 0.4809128940105438\n",
      "[proc 1][Train](346/100000) average regularization: 0.014816890470683575\n",
      "[proc 1][Train] 1 steps take 1.309 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.092\n",
      "[proc 0][Train](348/100000) average pos_loss: 0.31267231702804565\n",
      "[proc 0][Train](348/100000) average neg_loss: 0.6769471168518066\n",
      "[proc 0][Train](348/100000) average loss: 0.49480971693992615\n",
      "[proc 0][Train](348/100000) average regularization: 0.015039033256471157\n",
      "[proc 0][Train] 1 steps take 1.312 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.096\n",
      "[proc 1][Train](347/100000) average pos_loss: 0.3260915279388428\n",
      "[proc 1][Train](347/100000) average neg_loss: 0.3684346675872803\n",
      "[proc 1][Train](347/100000) average loss: 0.3472630977630615\n",
      "[proc 1][Train](347/100000) average regularization: 0.014899111352860928\n",
      "[proc 1][Train] 1 steps take 1.284 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.066\n",
      "[proc 0][Train](349/100000) average pos_loss: 0.3487853407859802\n",
      "[proc 0][Train](349/100000) average neg_loss: 0.3171202540397644\n",
      "[proc 0][Train](349/100000) average loss: 0.3329527974128723\n",
      "[proc 0][Train](349/100000) average regularization: 0.014862479642033577\n",
      "[proc 0][Train] 1 steps take 1.297 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.080\n",
      "[proc 1][Train](348/100000) average pos_loss: 0.344232976436615\n",
      "[proc 1][Train](348/100000) average neg_loss: 0.5653060674667358\n",
      "[proc 1][Train](348/100000) average loss: 0.4547695219516754\n",
      "[proc 1][Train](348/100000) average regularization: 0.014904435724020004\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](350/100000) average pos_loss: 0.3128579556941986\n",
      "[proc 0][Train](350/100000) average neg_loss: 0.7159802913665771\n",
      "[proc 0][Train](350/100000) average loss: 0.5144191384315491\n",
      "[proc 0][Train](350/100000) average regularization: 0.015073545277118683\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.097\n",
      "[proc 1][Train](349/100000) average pos_loss: 0.3201703131198883\n",
      "[proc 1][Train](349/100000) average neg_loss: 0.3439549505710602\n",
      "[proc 1][Train](349/100000) average loss: 0.33206263184547424\n",
      "[proc 1][Train](349/100000) average regularization: 0.015163064002990723\n",
      "[proc 1][Train] 1 steps take 1.321 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.104\n",
      "[proc 0][Train](351/100000) average pos_loss: 0.3621233105659485\n",
      "[proc 0][Train](351/100000) average neg_loss: 0.33495569229125977\n",
      "[proc 0][Train](351/100000) average loss: 0.3485395014286041\n",
      "[proc 0][Train](351/100000) average regularization: 0.014776882715523243\n",
      "[proc 0][Train] 1 steps take 1.293 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.002, update: 1.080\n",
      "[proc 1][Train](350/100000) average pos_loss: 0.3280021548271179\n",
      "[proc 1][Train](350/100000) average neg_loss: 0.6435628533363342\n",
      "[proc 1][Train](350/100000) average loss: 0.4857825040817261\n",
      "[proc 1][Train](350/100000) average regularization: 0.0149614829570055\n",
      "[proc 1][Train] 1 steps take 1.302 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](352/100000) average pos_loss: 0.3062697947025299\n",
      "[proc 0][Train](352/100000) average neg_loss: 0.6974581480026245\n",
      "[proc 0][Train](352/100000) average loss: 0.501863956451416\n",
      "[proc 0][Train](352/100000) average regularization: 0.01515335962176323\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.087\n",
      "[proc 1][Train](351/100000) average pos_loss: 0.325086772441864\n",
      "[proc 1][Train](351/100000) average neg_loss: 0.35567718744277954\n",
      "[proc 1][Train](351/100000) average loss: 0.3403819799423218\n",
      "[proc 1][Train](351/100000) average regularization: 0.015023277141153812\n",
      "[proc 1][Train] 1 steps take 1.279 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.062\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](353/100000) average pos_loss: 0.3638458251953125\n",
      "[proc 0][Train](353/100000) average neg_loss: 0.3302575349807739\n",
      "[proc 0][Train](353/100000) average loss: 0.3470516800880432\n",
      "[proc 0][Train](353/100000) average regularization: 0.01478207390755415\n",
      "[proc 0][Train] 1 steps take 1.286 seconds\n",
      "[proc 0]sample: 0.013, forward: 0.196, backward: 0.003, update: 1.073\n",
      "[proc 1][Train](352/100000) average pos_loss: 0.35127732157707214\n",
      "[proc 1][Train](352/100000) average neg_loss: 0.6517565846443176\n",
      "[proc 1][Train](352/100000) average loss: 0.5015169382095337\n",
      "[proc 1][Train](352/100000) average regularization: 0.014836643822491169\n",
      "[proc 1][Train] 1 steps take 1.336 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.116\n",
      "[proc 0][Train](354/100000) average pos_loss: 0.3168950080871582\n",
      "[proc 0][Train](354/100000) average neg_loss: 0.6857258677482605\n",
      "[proc 0][Train](354/100000) average loss: 0.5013104677200317\n",
      "[proc 0][Train](354/100000) average regularization: 0.015100773423910141\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.212, backward: 0.003, update: 1.084\n",
      "[proc 1][Train](353/100000) average pos_loss: 0.332631915807724\n",
      "[proc 1][Train](353/100000) average neg_loss: 0.33054035902023315\n",
      "[proc 1][Train](353/100000) average loss: 0.3315861225128174\n",
      "[proc 1][Train](353/100000) average regularization: 0.01493246853351593\n",
      "[proc 1][Train] 1 steps take 1.404 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.246, backward: 0.002, update: 1.139\n",
      "[proc 0][Train](355/100000) average pos_loss: 0.36109310388565063\n",
      "[proc 0][Train](355/100000) average neg_loss: 0.33315879106521606\n",
      "[proc 0][Train](355/100000) average loss: 0.34712594747543335\n",
      "[proc 0][Train](355/100000) average regularization: 0.014806470833718777\n",
      "[proc 0][Train] 1 steps take 1.332 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.216, backward: 0.002, update: 1.112\n",
      "[proc 1][Train](354/100000) average pos_loss: 0.3256799578666687\n",
      "[proc 1][Train](354/100000) average neg_loss: 0.6684636473655701\n",
      "[proc 1][Train](354/100000) average loss: 0.4970718026161194\n",
      "[proc 1][Train](354/100000) average regularization: 0.014858190901577473\n",
      "[proc 1][Train] 1 steps take 1.277 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.210, backward: 0.003, update: 1.048\n",
      "[proc 0][Train](356/100000) average pos_loss: 0.3054732382297516\n",
      "[proc 0][Train](356/100000) average neg_loss: 0.675166130065918\n",
      "[proc 0][Train](356/100000) average loss: 0.4903196692466736\n",
      "[proc 0][Train](356/100000) average regularization: 0.015010681003332138\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.114\n",
      "[proc 1][Train](355/100000) average pos_loss: 0.3360280394554138\n",
      "[proc 1][Train](355/100000) average neg_loss: 0.3247482180595398\n",
      "[proc 1][Train](355/100000) average loss: 0.3303881287574768\n",
      "[proc 1][Train](355/100000) average regularization: 0.014953600242733955\n",
      "[proc 1][Train] 1 steps take 1.322 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.103\n",
      "[proc 0][Train](357/100000) average pos_loss: 0.3637109398841858\n",
      "[proc 0][Train](357/100000) average neg_loss: 0.3304983377456665\n",
      "[proc 0][Train](357/100000) average loss: 0.34710463881492615\n",
      "[proc 0][Train](357/100000) average regularization: 0.014849836006760597\n",
      "[proc 0][Train] 1 steps take 1.312 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.098\n",
      "[proc 1][Train](356/100000) average pos_loss: 0.3388570547103882\n",
      "[proc 1][Train](356/100000) average neg_loss: 0.6392799615859985\n",
      "[proc 1][Train](356/100000) average loss: 0.48906850814819336\n",
      "[proc 1][Train](356/100000) average regularization: 0.014949353411793709\n",
      "[proc 1][Train] 1 steps take 1.275 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.056\n",
      "[proc 0][Train](358/100000) average pos_loss: 0.3081052005290985\n",
      "[proc 0][Train](358/100000) average neg_loss: 0.6736904978752136\n",
      "[proc 0][Train](358/100000) average loss: 0.4908978343009949\n",
      "[proc 0][Train](358/100000) average regularization: 0.015094372443854809\n",
      "[proc 0][Train] 1 steps take 1.301 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.002, update: 1.085\n",
      "[proc 1][Train](357/100000) average pos_loss: 0.3237046003341675\n",
      "[proc 1][Train](357/100000) average neg_loss: 0.30983561277389526\n",
      "[proc 1][Train](357/100000) average loss: 0.31677010655403137\n",
      "[proc 1][Train](357/100000) average regularization: 0.014946377836167812\n",
      "[proc 1][Train] 1 steps take 1.273 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.207, backward: 0.003, update: 1.062\n",
      "[proc 0][Train](359/100000) average pos_loss: 0.3573402464389801\n",
      "[proc 0][Train](359/100000) average neg_loss: 0.3469887375831604\n",
      "[proc 0][Train](359/100000) average loss: 0.35216450691223145\n",
      "[proc 0][Train](359/100000) average regularization: 0.014873236417770386\n",
      "[proc 0][Train] 1 steps take 1.311 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.095\n",
      "[proc 1][Train](358/100000) average pos_loss: 0.33450809121131897\n",
      "[proc 1][Train](358/100000) average neg_loss: 0.6938107013702393\n",
      "[proc 1][Train](358/100000) average loss: 0.5141593813896179\n",
      "[proc 1][Train](358/100000) average regularization: 0.015008951537311077\n",
      "[proc 1][Train] 1 steps take 1.267 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.050\n",
      "[proc 0][Train](360/100000) average pos_loss: 0.311798095703125\n",
      "[proc 0][Train](360/100000) average neg_loss: 0.7689352035522461\n",
      "[proc 0][Train](360/100000) average loss: 0.5403666496276855\n",
      "[proc 0][Train](360/100000) average regularization: 0.015200897119939327\n",
      "[proc 0][Train] 1 steps take 1.319 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.100\n",
      "[proc 1][Train](359/100000) average pos_loss: 0.3359755277633667\n",
      "[proc 1][Train](359/100000) average neg_loss: 0.3189414143562317\n",
      "[proc 1][Train](359/100000) average loss: 0.3274584710597992\n",
      "[proc 1][Train](359/100000) average regularization: 0.014894505962729454\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.075\n",
      "[proc 0][Train](361/100000) average pos_loss: 0.37706297636032104\n",
      "[proc 0][Train](361/100000) average neg_loss: 0.299107164144516\n",
      "[proc 0][Train](361/100000) average loss: 0.3380850553512573\n",
      "[proc 0][Train](361/100000) average regularization: 0.014651893638074398\n",
      "[proc 0][Train] 1 steps take 1.328 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.109\n",
      "[proc 1][Train](360/100000) average pos_loss: 0.34339189529418945\n",
      "[proc 1][Train](360/100000) average neg_loss: 0.6263803243637085\n",
      "[proc 1][Train](360/100000) average loss: 0.484886109828949\n",
      "[proc 1][Train](360/100000) average regularization: 0.01487663947045803\n",
      "[proc 1][Train] 1 steps take 1.296 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](362/100000) average pos_loss: 0.30903106927871704\n",
      "[proc 0][Train](362/100000) average neg_loss: 0.6779804825782776\n",
      "[proc 0][Train](362/100000) average loss: 0.4935057759284973\n",
      "[proc 0][Train](362/100000) average regularization: 0.014947285875678062\n",
      "[proc 0][Train] 1 steps take 1.317 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.101\n",
      "[proc 1][Train](361/100000) average pos_loss: 0.31724220514297485\n",
      "[proc 1][Train](361/100000) average neg_loss: 0.32584497332572937\n",
      "[proc 1][Train](361/100000) average loss: 0.3215435743331909\n",
      "[proc 1][Train](361/100000) average regularization: 0.014940710738301277\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](363/100000) average pos_loss: 0.3574835956096649\n",
      "[proc 0][Train](363/100000) average neg_loss: 0.3259837031364441\n",
      "[proc 0][Train](363/100000) average loss: 0.3417336344718933\n",
      "[proc 0][Train](363/100000) average regularization: 0.014776614494621754\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.095\n",
      "[proc 1][Train](362/100000) average pos_loss: 0.33418142795562744\n",
      "[proc 1][Train](362/100000) average neg_loss: 0.6203904747962952\n",
      "[proc 1][Train](362/100000) average loss: 0.4772859513759613\n",
      "[proc 1][Train](362/100000) average regularization: 0.01498610619455576\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.100\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](364/100000) average pos_loss: 0.3093344569206238\n",
      "[proc 0][Train](364/100000) average neg_loss: 0.7264840602874756\n",
      "[proc 0][Train](364/100000) average loss: 0.5179092884063721\n",
      "[proc 0][Train](364/100000) average regularization: 0.01505220215767622\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.090\n",
      "[proc 1][Train](363/100000) average pos_loss: 0.32327595353126526\n",
      "[proc 1][Train](363/100000) average neg_loss: 0.3420897126197815\n",
      "[proc 1][Train](363/100000) average loss: 0.33268284797668457\n",
      "[proc 1][Train](363/100000) average regularization: 0.014894998632371426\n",
      "[proc 1][Train] 1 steps take 1.379 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.160\n",
      "[proc 0][Train](365/100000) average pos_loss: 0.34769296646118164\n",
      "[proc 0][Train](365/100000) average neg_loss: 0.32016903162002563\n",
      "[proc 0][Train](365/100000) average loss: 0.33393099904060364\n",
      "[proc 0][Train](365/100000) average regularization: 0.014814464375376701\n",
      "[proc 0][Train] 1 steps take 1.292 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.075\n",
      "[proc 1][Train](364/100000) average pos_loss: 0.33815810084342957\n",
      "[proc 1][Train](364/100000) average neg_loss: 0.6305868625640869\n",
      "[proc 1][Train](364/100000) average loss: 0.48437249660491943\n",
      "[proc 1][Train](364/100000) average regularization: 0.01483511459082365\n",
      "[proc 1][Train] 1 steps take 1.292 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.074\n",
      "[proc 0][Train](366/100000) average pos_loss: 0.313407838344574\n",
      "[proc 0][Train](366/100000) average neg_loss: 0.7379060983657837\n",
      "[proc 0][Train](366/100000) average loss: 0.5256569385528564\n",
      "[proc 0][Train](366/100000) average regularization: 0.014977562241256237\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.194, backward: 0.002, update: 1.087\n",
      "[proc 1][Train](365/100000) average pos_loss: 0.32817888259887695\n",
      "[proc 1][Train](365/100000) average neg_loss: 0.3419058918952942\n",
      "[proc 1][Train](365/100000) average loss: 0.33504238724708557\n",
      "[proc 1][Train](365/100000) average regularization: 0.01499909907579422\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.089\n",
      "[proc 0][Train](367/100000) average pos_loss: 0.3607938289642334\n",
      "[proc 0][Train](367/100000) average neg_loss: 0.3009476065635681\n",
      "[proc 0][Train](367/100000) average loss: 0.33087071776390076\n",
      "[proc 0][Train](367/100000) average regularization: 0.014677467755973339\n",
      "[proc 0][Train] 1 steps take 1.320 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](366/100000) average pos_loss: 0.34510523080825806\n",
      "[proc 1][Train](366/100000) average neg_loss: 0.5786216855049133\n",
      "[proc 1][Train](366/100000) average loss: 0.4618634581565857\n",
      "[proc 1][Train](366/100000) average regularization: 0.01478087529540062\n",
      "[proc 1][Train] 1 steps take 1.295 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.211, backward: 0.003, update: 1.079\n",
      "[proc 0][Train](368/100000) average pos_loss: 0.3086537718772888\n",
      "[proc 0][Train](368/100000) average neg_loss: 0.702294111251831\n",
      "[proc 0][Train](368/100000) average loss: 0.5054739713668823\n",
      "[proc 0][Train](368/100000) average regularization: 0.015058635734021664\n",
      "[proc 0][Train] 1 steps take 1.359 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.145\n",
      "[proc 1][Train](367/100000) average pos_loss: 0.3244536519050598\n",
      "[proc 1][Train](367/100000) average neg_loss: 0.3696667551994324\n",
      "[proc 1][Train](367/100000) average loss: 0.3470602035522461\n",
      "[proc 1][Train](367/100000) average regularization: 0.014905792661011219\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.116\n",
      "[proc 0][Train](369/100000) average pos_loss: 0.35436445474624634\n",
      "[proc 0][Train](369/100000) average neg_loss: 0.3478696346282959\n",
      "[proc 0][Train](369/100000) average loss: 0.3511170446872711\n",
      "[proc 0][Train](369/100000) average regularization: 0.014762175269424915\n",
      "[proc 0][Train] 1 steps take 1.380 seconds\n",
      "[proc 0]sample: 0.104, forward: 0.214, backward: 0.003, update: 1.059\n",
      "[proc 1][Train](368/100000) average pos_loss: 0.34164512157440186\n",
      "[proc 1][Train](368/100000) average neg_loss: 0.6362593173980713\n",
      "[proc 1][Train](368/100000) average loss: 0.4889522194862366\n",
      "[proc 1][Train](368/100000) average regularization: 0.014858106151223183\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.215, backward: 0.002, update: 1.098\n",
      "[proc 0][Train](370/100000) average pos_loss: 0.3242582678794861\n",
      "[proc 0][Train](370/100000) average neg_loss: 0.7111499905586243\n",
      "[proc 0][Train](370/100000) average loss: 0.5177041292190552\n",
      "[proc 0][Train](370/100000) average regularization: 0.015036084689199924\n",
      "[proc 0][Train] 1 steps take 1.309 seconds\n",
      "[proc 0]sample: 0.020, forward: 0.214, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](369/100000) average pos_loss: 0.3363400995731354\n",
      "[proc 1][Train](369/100000) average neg_loss: 0.31228262186050415\n",
      "[proc 1][Train](369/100000) average loss: 0.32431137561798096\n",
      "[proc 1][Train](369/100000) average regularization: 0.01479349099099636\n",
      "[proc 1][Train] 1 steps take 1.428 seconds\n",
      "[proc 1]sample: 0.106, forward: 0.213, backward: 0.003, update: 1.105\n",
      "[proc 0][Train](371/100000) average pos_loss: 0.3674428462982178\n",
      "[proc 0][Train](371/100000) average neg_loss: 0.3222047984600067\n",
      "[proc 0][Train](371/100000) average loss: 0.34482383728027344\n",
      "[proc 0][Train](371/100000) average regularization: 0.014554418623447418\n",
      "[proc 0][Train] 1 steps take 1.310 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.091\n",
      "[proc 1][Train](370/100000) average pos_loss: 0.33256006240844727\n",
      "[proc 1][Train](370/100000) average neg_loss: 0.6082106828689575\n",
      "[proc 1][Train](370/100000) average loss: 0.4703853726387024\n",
      "[proc 1][Train](370/100000) average regularization: 0.014872340485453606\n",
      "[proc 1][Train] 1 steps take 1.354 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.208, backward: 0.002, update: 1.126\n",
      "[proc 0][Train](372/100000) average pos_loss: 0.31037312746047974\n",
      "[proc 0][Train](372/100000) average neg_loss: 0.7567134499549866\n",
      "[proc 0][Train](372/100000) average loss: 0.5335432887077332\n",
      "[proc 0][Train](372/100000) average regularization: 0.014981534332036972\n",
      "[proc 0][Train] 1 steps take 1.354 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.140\n",
      "[proc 1][Train](371/100000) average pos_loss: 0.3193630874156952\n",
      "[proc 1][Train](371/100000) average neg_loss: 0.31519055366516113\n",
      "[proc 1][Train](371/100000) average loss: 0.31727683544158936\n",
      "[proc 1][Train](371/100000) average regularization: 0.014980915933847427\n",
      "[proc 1][Train] 1 steps take 1.312 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](373/100000) average pos_loss: 0.3638325333595276\n",
      "[proc 0][Train](373/100000) average neg_loss: 0.3162345588207245\n",
      "[proc 0][Train](373/100000) average loss: 0.34003353118896484\n",
      "[proc 0][Train](373/100000) average regularization: 0.01461983472108841\n",
      "[proc 0][Train] 1 steps take 1.330 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.111\n",
      "[proc 1][Train](372/100000) average pos_loss: 0.3416579067707062\n",
      "[proc 1][Train](372/100000) average neg_loss: 0.5956084728240967\n",
      "[proc 1][Train](372/100000) average loss: 0.46863317489624023\n",
      "[proc 1][Train](372/100000) average regularization: 0.01483200490474701\n",
      "[proc 1][Train] 1 steps take 1.281 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.206, backward: 0.004, update: 1.071\n",
      "[proc 0][Train](374/100000) average pos_loss: 0.31798410415649414\n",
      "[proc 0][Train](374/100000) average neg_loss: 0.699325680732727\n",
      "[proc 0][Train](374/100000) average loss: 0.5086548924446106\n",
      "[proc 0][Train](374/100000) average regularization: 0.014885973185300827\n",
      "[proc 0][Train] 1 steps take 1.334 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.125\n",
      "[proc 1][Train](373/100000) average pos_loss: 0.32308122515678406\n",
      "[proc 1][Train](373/100000) average neg_loss: 0.33474141359329224\n",
      "[proc 1][Train](373/100000) average loss: 0.32891130447387695\n",
      "[proc 1][Train](373/100000) average regularization: 0.015012867748737335\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.103\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](375/100000) average pos_loss: 0.351129949092865\n",
      "[proc 0][Train](375/100000) average neg_loss: 0.33780384063720703\n",
      "[proc 0][Train](375/100000) average loss: 0.344466894865036\n",
      "[proc 0][Train](375/100000) average regularization: 0.014717132784426212\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.004, update: 1.069\n",
      "[proc 1][Train](374/100000) average pos_loss: 0.335557222366333\n",
      "[proc 1][Train](374/100000) average neg_loss: 0.6496797204017639\n",
      "[proc 1][Train](374/100000) average loss: 0.49261847138404846\n",
      "[proc 1][Train](374/100000) average regularization: 0.01479124091565609\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](376/100000) average pos_loss: 0.3148728311061859\n",
      "[proc 0][Train](376/100000) average neg_loss: 0.7018451690673828\n",
      "[proc 0][Train](376/100000) average loss: 0.5083590149879456\n",
      "[proc 0][Train](376/100000) average regularization: 0.014997434802353382\n",
      "[proc 0][Train] 1 steps take 1.351 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.137\n",
      "[proc 1][Train](375/100000) average pos_loss: 0.3323338031768799\n",
      "[proc 1][Train](375/100000) average neg_loss: 0.3518834710121155\n",
      "[proc 1][Train](375/100000) average loss: 0.3421086370944977\n",
      "[proc 1][Train](375/100000) average regularization: 0.014847717247903347\n",
      "[proc 1][Train] 1 steps take 1.324 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.103\n",
      "[proc 0][Train](377/100000) average pos_loss: 0.35937362909317017\n",
      "[proc 0][Train](377/100000) average neg_loss: 0.28386396169662476\n",
      "[proc 0][Train](377/100000) average loss: 0.32161879539489746\n",
      "[proc 0][Train](377/100000) average regularization: 0.01470684539526701\n",
      "[proc 0][Train] 1 steps take 1.323 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.211, backward: 0.002, update: 1.109\n",
      "[proc 1][Train](376/100000) average pos_loss: 0.3383282423019409\n",
      "[proc 1][Train](376/100000) average neg_loss: 0.5690481662750244\n",
      "[proc 1][Train](376/100000) average loss: 0.45368820428848267\n",
      "[proc 1][Train](376/100000) average regularization: 0.014818884432315826\n",
      "[proc 1][Train] 1 steps take 1.317 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.002, update: 1.100\n",
      "[proc 0][Train](378/100000) average pos_loss: 0.30733057856559753\n",
      "[proc 0][Train](378/100000) average neg_loss: 0.7122448682785034\n",
      "[proc 0][Train](378/100000) average loss: 0.5097877383232117\n",
      "[proc 0][Train](378/100000) average regularization: 0.01489933580160141\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.088\n",
      "[proc 1][Train](377/100000) average pos_loss: 0.3137821555137634\n",
      "[proc 1][Train](377/100000) average neg_loss: 0.3811507225036621\n",
      "[proc 1][Train](377/100000) average loss: 0.34746643900871277\n",
      "[proc 1][Train](377/100000) average regularization: 0.015042439103126526\n",
      "[proc 1][Train] 1 steps take 1.306 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.213, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](379/100000) average pos_loss: 0.35279566049575806\n",
      "[proc 0][Train](379/100000) average neg_loss: 0.31860658526420593\n",
      "[proc 0][Train](379/100000) average loss: 0.3357011079788208\n",
      "[proc 0][Train](379/100000) average regularization: 0.014819273725152016\n",
      "[proc 0][Train] 1 steps take 1.332 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.115\n",
      "[proc 1][Train](378/100000) average pos_loss: 0.33592215180397034\n",
      "[proc 1][Train](378/100000) average neg_loss: 0.7062503099441528\n",
      "[proc 1][Train](378/100000) average loss: 0.5210862159729004\n",
      "[proc 1][Train](378/100000) average regularization: 0.014811873435974121\n",
      "[proc 1][Train] 1 steps take 1.326 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.223, backward: 0.003, update: 1.098\n",
      "[proc 0][Train](380/100000) average pos_loss: 0.32320883870124817\n",
      "[proc 0][Train](380/100000) average neg_loss: 0.6628085374832153\n",
      "[proc 0][Train](380/100000) average loss: 0.49300867319107056\n",
      "[proc 0][Train](380/100000) average regularization: 0.014987065456807613\n",
      "[proc 0][Train] 1 steps take 1.298 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.081\n",
      "[proc 1][Train](379/100000) average pos_loss: 0.3411405086517334\n",
      "[proc 1][Train](379/100000) average neg_loss: 0.30989375710487366\n",
      "[proc 1][Train](379/100000) average loss: 0.32551711797714233\n",
      "[proc 1][Train](379/100000) average regularization: 0.014847889542579651\n",
      "[proc 1][Train] 1 steps take 1.299 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.080\n",
      "[proc 0][Train](381/100000) average pos_loss: 0.3628079891204834\n",
      "[proc 0][Train](381/100000) average neg_loss: 0.3120710849761963\n",
      "[proc 0][Train](381/100000) average loss: 0.33743953704833984\n",
      "[proc 0][Train](381/100000) average regularization: 0.014649211429059505\n",
      "[proc 0][Train] 1 steps take 1.305 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.211, backward: 0.003, update: 1.089\n",
      "[proc 1][Train](380/100000) average pos_loss: 0.3318904936313629\n",
      "[proc 1][Train](380/100000) average neg_loss: 0.6264711618423462\n",
      "[proc 1][Train](380/100000) average loss: 0.47918081283569336\n",
      "[proc 1][Train](380/100000) average regularization: 0.014787767082452774\n",
      "[proc 1][Train] 1 steps take 1.308 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.199, backward: 0.003, update: 1.105\n",
      "[proc 0][Train](382/100000) average pos_loss: 0.30824536085128784\n",
      "[proc 0][Train](382/100000) average neg_loss: 0.7213402390480042\n",
      "[proc 0][Train](382/100000) average loss: 0.514792799949646\n",
      "[proc 0][Train](382/100000) average regularization: 0.014943559654057026\n",
      "[proc 0][Train] 1 steps take 1.292 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.073\n",
      "[proc 1][Train](381/100000) average pos_loss: 0.3239172697067261\n",
      "[proc 1][Train](381/100000) average neg_loss: 0.3539011478424072\n",
      "[proc 1][Train](381/100000) average loss: 0.33890920877456665\n",
      "[proc 1][Train](381/100000) average regularization: 0.014725139364600182\n",
      "[proc 1][Train] 1 steps take 1.311 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.094\n",
      "[proc 0][Train](383/100000) average pos_loss: 0.3585684299468994\n",
      "[proc 0][Train](383/100000) average neg_loss: 0.3350915014743805\n",
      "[proc 0][Train](383/100000) average loss: 0.34682995080947876\n",
      "[proc 0][Train](383/100000) average regularization: 0.014705417677760124\n",
      "[proc 0][Train] 1 steps take 1.325 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.107\n",
      "[proc 1][Train](382/100000) average pos_loss: 0.3446824848651886\n",
      "[proc 1][Train](382/100000) average neg_loss: 0.5994493961334229\n",
      "[proc 1][Train](382/100000) average loss: 0.47206592559814453\n",
      "[proc 1][Train](382/100000) average regularization: 0.014831092208623886\n",
      "[proc 1][Train] 1 steps take 1.277 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.071\n",
      "[proc 0][Train](384/100000) average pos_loss: 0.3216727375984192\n",
      "[proc 0][Train](384/100000) average neg_loss: 0.668701171875\n",
      "[proc 0][Train](384/100000) average loss: 0.4951869547367096\n",
      "[proc 0][Train](384/100000) average regularization: 0.015002570115029812\n",
      "[proc 0][Train] 1 steps take 1.366 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.150\n",
      "[proc 1][Train](383/100000) average pos_loss: 0.33425021171569824\n",
      "[proc 1][Train](383/100000) average neg_loss: 0.3167117238044739\n",
      "[proc 1][Train](383/100000) average loss: 0.32548096776008606\n",
      "[proc 1][Train](383/100000) average regularization: 0.014828021638095379\n",
      "[proc 1][Train] 1 steps take 1.309 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.197, backward: 0.002, update: 1.109\n",
      "[proc 0][Train](385/100000) average pos_loss: 0.3534003496170044\n",
      "[proc 0][Train](385/100000) average neg_loss: 0.31751856207847595\n",
      "[proc 0][Train](385/100000) average loss: 0.33545947074890137\n",
      "[proc 0][Train](385/100000) average regularization: 0.014581509865820408\n",
      "[proc 0][Train] 1 steps take 1.338 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.202, backward: 0.003, update: 1.116\n",
      "[proc 1][Train](384/100000) average pos_loss: 0.3264366090297699\n",
      "[proc 1][Train](384/100000) average neg_loss: 0.67528235912323\n",
      "[proc 1][Train](384/100000) average loss: 0.5008594989776611\n",
      "[proc 1][Train](384/100000) average regularization: 0.014849941246211529\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.091\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](386/100000) average pos_loss: 0.30655306577682495\n",
      "[proc 0][Train](386/100000) average neg_loss: 0.6838657259941101\n",
      "[proc 0][Train](386/100000) average loss: 0.49520939588546753\n",
      "[proc 0][Train](386/100000) average regularization: 0.014980941079556942\n",
      "[proc 0][Train] 1 steps take 1.339 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.207, backward: 0.003, update: 1.112\n",
      "[proc 1][Train](385/100000) average pos_loss: 0.3289732038974762\n",
      "[proc 1][Train](385/100000) average neg_loss: 0.34738147258758545\n",
      "[proc 1][Train](385/100000) average loss: 0.33817732334136963\n",
      "[proc 1][Train](385/100000) average regularization: 0.01479216106235981\n",
      "[proc 1][Train] 1 steps take 1.344 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.213, backward: 0.002, update: 1.111\n",
      "[proc 0][Train](387/100000) average pos_loss: 0.36554068326950073\n",
      "[proc 0][Train](387/100000) average neg_loss: 0.30531740188598633\n",
      "[proc 0][Train](387/100000) average loss: 0.33542904257774353\n",
      "[proc 0][Train](387/100000) average regularization: 0.014685220085084438\n",
      "[proc 0][Train] 1 steps take 1.259 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.212, backward: 0.003, update: 1.042\n",
      "[proc 1][Train](386/100000) average pos_loss: 0.35362765192985535\n",
      "[proc 1][Train](386/100000) average neg_loss: 0.6516985893249512\n",
      "[proc 1][Train](386/100000) average loss: 0.5026631355285645\n",
      "[proc 1][Train](386/100000) average regularization: 0.014759598299860954\n",
      "[proc 1][Train] 1 steps take 1.330 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.215, backward: 0.002, update: 1.094\n",
      "[proc 0][Train](388/100000) average pos_loss: 0.31522858142852783\n",
      "[proc 0][Train](388/100000) average neg_loss: 0.6877937316894531\n",
      "[proc 0][Train](388/100000) average loss: 0.5015111565589905\n",
      "[proc 0][Train](388/100000) average regularization: 0.014885645359754562\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.003, update: 1.101\n",
      "[proc 1][Train](387/100000) average pos_loss: 0.3320743441581726\n",
      "[proc 1][Train](387/100000) average neg_loss: 0.3216269016265869\n",
      "[proc 1][Train](387/100000) average loss: 0.32685062289237976\n",
      "[proc 1][Train](387/100000) average regularization: 0.014917002990841866\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.091\n",
      "[proc 0][Train](389/100000) average pos_loss: 0.36643552780151367\n",
      "[proc 0][Train](389/100000) average neg_loss: 0.306925505399704\n",
      "[proc 0][Train](389/100000) average loss: 0.33668053150177\n",
      "[proc 0][Train](389/100000) average regularization: 0.01459190808236599\n",
      "[proc 0][Train] 1 steps take 1.313 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.222, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](388/100000) average pos_loss: 0.3344643712043762\n",
      "[proc 1][Train](388/100000) average neg_loss: 0.6266297698020935\n",
      "[proc 1][Train](388/100000) average loss: 0.48054707050323486\n",
      "[proc 1][Train](388/100000) average regularization: 0.014719043858349323\n",
      "[proc 1][Train] 1 steps take 1.356 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.252, backward: 0.002, update: 1.100\n",
      "[proc 0][Train](390/100000) average pos_loss: 0.3068245053291321\n",
      "[proc 0][Train](390/100000) average neg_loss: 0.6885492205619812\n",
      "[proc 0][Train](390/100000) average loss: 0.49768686294555664\n",
      "[proc 0][Train](390/100000) average regularization: 0.014971339143812656\n",
      "[proc 0][Train] 1 steps take 1.278 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.206, backward: 0.003, update: 1.068\n",
      "[proc 1][Train](389/100000) average pos_loss: 0.3254265785217285\n",
      "[proc 1][Train](389/100000) average neg_loss: 0.36604732275009155\n",
      "[proc 1][Train](389/100000) average loss: 0.34573695063591003\n",
      "[proc 1][Train](389/100000) average regularization: 0.014938678592443466\n",
      "[proc 1][Train] 1 steps take 1.341 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.125\n",
      "[proc 0][Train](391/100000) average pos_loss: 0.35558661818504333\n",
      "[proc 0][Train](391/100000) average neg_loss: 0.31503117084503174\n",
      "[proc 0][Train](391/100000) average loss: 0.33530890941619873\n",
      "[proc 0][Train](391/100000) average regularization: 0.014759238809347153\n",
      "[proc 0][Train] 1 steps take 1.317 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.098\n",
      "[proc 1][Train](390/100000) average pos_loss: 0.3486650586128235\n",
      "[proc 1][Train](390/100000) average neg_loss: 0.5824657678604126\n",
      "[proc 1][Train](390/100000) average loss: 0.46556541323661804\n",
      "[proc 1][Train](390/100000) average regularization: 0.014734920114278793\n",
      "[proc 1][Train] 1 steps take 1.279 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.062\n",
      "[proc 0][Train](392/100000) average pos_loss: 0.3196602761745453\n",
      "[proc 0][Train](392/100000) average neg_loss: 0.722179651260376\n",
      "[proc 0][Train](392/100000) average loss: 0.5209199786186218\n",
      "[proc 0][Train](392/100000) average regularization: 0.015027482993900776\n",
      "[proc 0][Train] 1 steps take 1.291 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.214, backward: 0.003, update: 1.072\n",
      "[proc 1][Train](391/100000) average pos_loss: 0.33585965633392334\n",
      "[proc 1][Train](391/100000) average neg_loss: 0.32081177830696106\n",
      "[proc 1][Train](391/100000) average loss: 0.328335702419281\n",
      "[proc 1][Train](391/100000) average regularization: 0.014871695078909397\n",
      "[proc 1][Train] 1 steps take 1.279 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.074\n",
      "[proc 0][Train](393/100000) average pos_loss: 0.3655742406845093\n",
      "[proc 0][Train](393/100000) average neg_loss: 0.33786237239837646\n",
      "[proc 0][Train](393/100000) average loss: 0.35171830654144287\n",
      "[proc 0][Train](393/100000) average regularization: 0.014663001522421837\n",
      "[proc 0][Train] 1 steps take 1.302 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.087\n",
      "[proc 1][Train](392/100000) average pos_loss: 0.32972779870033264\n",
      "[proc 1][Train](392/100000) average neg_loss: 0.6137917637825012\n",
      "[proc 1][Train](392/100000) average loss: 0.4717597961425781\n",
      "[proc 1][Train](392/100000) average regularization: 0.014834031462669373\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.002, update: 1.078\n",
      "[proc 0][Train](394/100000) average pos_loss: 0.3073943257331848\n",
      "[proc 0][Train](394/100000) average neg_loss: 0.716559112071991\n",
      "[proc 0][Train](394/100000) average loss: 0.5119767189025879\n",
      "[proc 0][Train](394/100000) average regularization: 0.014962896704673767\n",
      "[proc 0][Train] 1 steps take 1.374 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.157\n",
      "[proc 1][Train](393/100000) average pos_loss: 0.3283412754535675\n",
      "[proc 1][Train](393/100000) average neg_loss: 0.3504464328289032\n",
      "[proc 1][Train](393/100000) average loss: 0.33939385414123535\n",
      "[proc 1][Train](393/100000) average regularization: 0.014925383031368256\n",
      "[proc 1][Train] 1 steps take 1.305 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.002, update: 1.088\n",
      "[proc 0][Train](395/100000) average pos_loss: 0.35180431604385376\n",
      "[proc 0][Train](395/100000) average neg_loss: 0.33418530225753784\n",
      "[proc 0][Train](395/100000) average loss: 0.3429948091506958\n",
      "[proc 0][Train](395/100000) average regularization: 0.014571931213140488\n",
      "[proc 0][Train] 1 steps take 1.267 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.205, backward: 0.003, update: 1.058\n",
      "[proc 1][Train](394/100000) average pos_loss: 0.34689462184906006\n",
      "[proc 1][Train](394/100000) average neg_loss: 0.635013222694397\n",
      "[proc 1][Train](394/100000) average loss: 0.4909539222717285\n",
      "[proc 1][Train](394/100000) average regularization: 0.014782064594328403\n",
      "[proc 1][Train] 1 steps take 1.310 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.091\n",
      "[proc 0][Train](396/100000) average pos_loss: 0.32247108221054077\n",
      "[proc 0][Train](396/100000) average neg_loss: 0.7072861790657043\n",
      "[proc 0][Train](396/100000) average loss: 0.5148786306381226\n",
      "[proc 0][Train](396/100000) average regularization: 0.014840087853372097\n",
      "[proc 0][Train] 1 steps take 1.354 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.135\n",
      "[proc 1][Train](395/100000) average pos_loss: 0.33066219091415405\n",
      "[proc 1][Train](395/100000) average neg_loss: 0.34354162216186523\n",
      "[proc 1][Train](395/100000) average loss: 0.33710190653800964\n",
      "[proc 1][Train](395/100000) average regularization: 0.014784668572247028\n",
      "[proc 1][Train] 1 steps take 1.335 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.206, backward: 0.002, update: 1.125\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](397/100000) average pos_loss: 0.3628108501434326\n",
      "[proc 0][Train](397/100000) average neg_loss: 0.3102508783340454\n",
      "[proc 0][Train](397/100000) average loss: 0.336530864238739\n",
      "[proc 0][Train](397/100000) average regularization: 0.014613877050578594\n",
      "[proc 0][Train] 1 steps take 1.308 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.217, backward: 0.003, update: 1.086\n",
      "[proc 1][Train](396/100000) average pos_loss: 0.3385545611381531\n",
      "[proc 1][Train](396/100000) average neg_loss: 0.6132030487060547\n",
      "[proc 1][Train](396/100000) average loss: 0.4758788049221039\n",
      "[proc 1][Train](396/100000) average regularization: 0.0146955456584692\n",
      "[proc 1][Train] 1 steps take 1.316 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.219, backward: 0.002, update: 1.094\n",
      "[proc 0][Train](398/100000) average pos_loss: 0.31479692459106445\n",
      "[proc 0][Train](398/100000) average neg_loss: 0.6740021705627441\n",
      "[proc 0][Train](398/100000) average loss: 0.4943995475769043\n",
      "[proc 0][Train](398/100000) average regularization: 0.014962589368224144\n",
      "[proc 0][Train] 1 steps take 1.345 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.206, backward: 0.004, update: 1.134\n",
      "[proc 1][Train](397/100000) average pos_loss: 0.3266875147819519\n",
      "[proc 1][Train](397/100000) average neg_loss: 0.3402339816093445\n",
      "[proc 1][Train](397/100000) average loss: 0.3334607481956482\n",
      "[proc 1][Train](397/100000) average regularization: 0.014778473414480686\n",
      "[proc 1][Train] 1 steps take 1.429 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.211, backward: 0.002, update: 1.215\n",
      "[proc 0][Train](399/100000) average pos_loss: 0.35248202085494995\n",
      "[proc 0][Train](399/100000) average neg_loss: 0.3322632312774658\n",
      "[proc 0][Train](399/100000) average loss: 0.3423726260662079\n",
      "[proc 0][Train](399/100000) average regularization: 0.014788449741899967\n",
      "[proc 0][Train] 1 steps take 1.247 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.206, backward: 0.003, update: 1.037\n",
      "[proc 1][Train](398/100000) average pos_loss: 0.33609190583229065\n",
      "[proc 1][Train](398/100000) average neg_loss: 0.6513718366622925\n",
      "[proc 1][Train](398/100000) average loss: 0.49373185634613037\n",
      "[proc 1][Train](398/100000) average regularization: 0.014750650152564049\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.102\n",
      "[proc 0][Train](400/100000) average pos_loss: 0.3125128149986267\n",
      "[proc 0][Train](400/100000) average neg_loss: 0.7173175811767578\n",
      "[proc 0][Train](400/100000) average loss: 0.5149152278900146\n",
      "[proc 0][Train](400/100000) average regularization: 0.014932595193386078\n",
      "[proc 0][Train] 1 steps take 1.337 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.134\n",
      "[proc 1][Train](399/100000) average pos_loss: 0.33831894397735596\n",
      "[proc 1][Train](399/100000) average neg_loss: 0.31320130825042725\n",
      "[proc 1][Train](399/100000) average loss: 0.3257601261138916\n",
      "[proc 1][Train](399/100000) average regularization: 0.014712311327457428\n",
      "[proc 1][Train] 1 steps take 1.291 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.075\n",
      "[proc 0][Train](401/100000) average pos_loss: 0.3683345317840576\n",
      "[proc 0][Train](401/100000) average neg_loss: 0.31748878955841064\n",
      "[proc 0][Train](401/100000) average loss: 0.34291166067123413\n",
      "[proc 0][Train](401/100000) average regularization: 0.014601526781916618\n",
      "[proc 0][Train] 1 steps take 1.326 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.211, backward: 0.003, update: 1.096\n",
      "[proc 1][Train](400/100000) average pos_loss: 0.3389911651611328\n",
      "[proc 1][Train](400/100000) average neg_loss: 0.6251298785209656\n",
      "[proc 1][Train](400/100000) average loss: 0.4820605218410492\n",
      "[proc 1][Train](400/100000) average regularization: 0.014872587285935879\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.090\n",
      "[proc 0][Train](402/100000) average pos_loss: 0.31268638372421265\n",
      "[proc 0][Train](402/100000) average neg_loss: 0.7146971821784973\n",
      "[proc 0][Train](402/100000) average loss: 0.513691782951355\n",
      "[proc 0][Train](402/100000) average regularization: 0.014888059347867966\n",
      "[proc 0][Train] 1 steps take 1.326 seconds\n",
      "[proc 0]sample: 0.020, forward: 0.215, backward: 0.002, update: 1.089\n",
      "[proc 1][Train](401/100000) average pos_loss: 0.3370759189128876\n",
      "[proc 1][Train](401/100000) average neg_loss: 0.3278174102306366\n",
      "[proc 1][Train](401/100000) average loss: 0.3324466645717621\n",
      "[proc 1][Train](401/100000) average regularization: 0.014831739477813244\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.196, backward: 0.002, update: 1.075\n",
      "[proc 0][Train](403/100000) average pos_loss: 0.36069029569625854\n",
      "[proc 0][Train](403/100000) average neg_loss: 0.30866166949272156\n",
      "[proc 0][Train](403/100000) average loss: 0.33467596769332886\n",
      "[proc 0][Train](403/100000) average regularization: 0.01463295053690672\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.068\n",
      "[proc 1][Train](402/100000) average pos_loss: 0.33292168378829956\n",
      "[proc 1][Train](402/100000) average neg_loss: 0.651160717010498\n",
      "[proc 1][Train](402/100000) average loss: 0.4920412003993988\n",
      "[proc 1][Train](402/100000) average regularization: 0.014747797511518002\n",
      "[proc 1][Train] 1 steps take 1.295 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.205, backward: 0.002, update: 1.073\n",
      "[proc 0][Train](404/100000) average pos_loss: 0.3070155084133148\n",
      "[proc 0][Train](404/100000) average neg_loss: 0.71962571144104\n",
      "[proc 0][Train](404/100000) average loss: 0.5133206248283386\n",
      "[proc 0][Train](404/100000) average regularization: 0.014942985959351063\n",
      "[proc 0][Train] 1 steps take 1.300 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.082\n",
      "[proc 1][Train](403/100000) average pos_loss: 0.32656747102737427\n",
      "[proc 1][Train](403/100000) average neg_loss: 0.34873709082603455\n",
      "[proc 1][Train](403/100000) average loss: 0.3376522660255432\n",
      "[proc 1][Train](403/100000) average regularization: 0.014810345135629177\n",
      "[proc 1][Train] 1 steps take 1.307 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.088\n",
      "[proc 0][Train](405/100000) average pos_loss: 0.3710017204284668\n",
      "[proc 0][Train](405/100000) average neg_loss: 0.30816054344177246\n",
      "[proc 0][Train](405/100000) average loss: 0.33958113193511963\n",
      "[proc 0][Train](405/100000) average regularization: 0.014675132930278778\n",
      "[proc 0][Train] 1 steps take 1.304 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.089\n",
      "[proc 1][Train](404/100000) average pos_loss: 0.35122233629226685\n",
      "[proc 1][Train](404/100000) average neg_loss: 0.6109632253646851\n",
      "[proc 1][Train](404/100000) average loss: 0.48109278082847595\n",
      "[proc 1][Train](404/100000) average regularization: 0.014666222035884857\n",
      "[proc 1][Train] 1 steps take 1.298 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.081\n",
      "[proc 0][Train](406/100000) average pos_loss: 0.3194050192832947\n",
      "[proc 0][Train](406/100000) average neg_loss: 0.6564139127731323\n",
      "[proc 0][Train](406/100000) average loss: 0.4879094660282135\n",
      "[proc 0][Train](406/100000) average regularization: 0.01472522784024477\n",
      "[proc 0][Train] 1 steps take 1.308 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.090\n",
      "[proc 1][Train](405/100000) average pos_loss: 0.3226541578769684\n",
      "[proc 1][Train](405/100000) average neg_loss: 0.34583860635757446\n",
      "[proc 1][Train](405/100000) average loss: 0.3342463970184326\n",
      "[proc 1][Train](405/100000) average regularization: 0.014793689362704754\n",
      "[proc 1][Train] 1 steps take 1.290 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.203, backward: 0.003, update: 1.082\n",
      "[proc 0][Train](407/100000) average pos_loss: 0.35932374000549316\n",
      "[proc 0][Train](407/100000) average neg_loss: 0.30994170904159546\n",
      "[proc 0][Train](407/100000) average loss: 0.3346327245235443\n",
      "[proc 0][Train](407/100000) average regularization: 0.014676232822239399\n",
      "[proc 0][Train] 1 steps take 1.288 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.071\n",
      "[proc 1][Train](406/100000) average pos_loss: 0.3372627794742584\n",
      "[proc 1][Train](406/100000) average neg_loss: 0.6392453908920288\n",
      "[proc 1][Train](406/100000) average loss: 0.4882540702819824\n",
      "[proc 1][Train](406/100000) average regularization: 0.014698254875838757\n",
      "[proc 1][Train] 1 steps take 1.303 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.083\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](408/100000) average pos_loss: 0.3179245591163635\n",
      "[proc 0][Train](408/100000) average neg_loss: 0.7072863578796387\n",
      "[proc 0][Train](408/100000) average loss: 0.5126054286956787\n",
      "[proc 0][Train](408/100000) average regularization: 0.014890302903950214\n",
      "[proc 0][Train] 1 steps take 1.314 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.201, backward: 0.003, update: 1.110\n",
      "[proc 1][Train](407/100000) average pos_loss: 0.3261300325393677\n",
      "[proc 1][Train](407/100000) average neg_loss: 0.3217121362686157\n",
      "[proc 1][Train](407/100000) average loss: 0.3239210844039917\n",
      "[proc 1][Train](407/100000) average regularization: 0.014736500568687916\n",
      "[proc 1][Train] 1 steps take 1.287 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.200, backward: 0.003, update: 1.083\n",
      "[proc 0][Train](409/100000) average pos_loss: 0.35612234473228455\n",
      "[proc 0][Train](409/100000) average neg_loss: 0.32838892936706543\n",
      "[proc 0][Train](409/100000) average loss: 0.3422556519508362\n",
      "[proc 0][Train](409/100000) average regularization: 0.014675836078822613\n",
      "[proc 0][Train] 1 steps take 1.272 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.003, update: 1.058\n",
      "[proc 1][Train](408/100000) average pos_loss: 0.35099899768829346\n",
      "[proc 1][Train](408/100000) average neg_loss: 0.6808754801750183\n",
      "[proc 1][Train](408/100000) average loss: 0.5159372091293335\n",
      "[proc 1][Train](408/100000) average regularization: 0.014550342224538326\n",
      "[proc 1][Train] 1 steps take 1.288 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.215, backward: 0.003, update: 1.069\n",
      "[proc 0][Train](410/100000) average pos_loss: 0.31036898493766785\n",
      "[proc 0][Train](410/100000) average neg_loss: 0.7131463289260864\n",
      "[proc 0][Train](410/100000) average loss: 0.5117576718330383\n",
      "[proc 0][Train](410/100000) average regularization: 0.014962389133870602\n",
      "[proc 0][Train] 1 steps take 1.280 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.204, backward: 0.003, update: 1.071\n",
      "[proc 1][Train](409/100000) average pos_loss: 0.3391109108924866\n",
      "[proc 1][Train](409/100000) average neg_loss: 0.3215065598487854\n",
      "[proc 1][Train](409/100000) average loss: 0.330308735370636\n",
      "[proc 1][Train](409/100000) average regularization: 0.014531778171658516\n",
      "[proc 1][Train] 1 steps take 1.284 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.196, backward: 0.003, update: 1.084\n",
      "[proc 0][Train](411/100000) average pos_loss: 0.3645052909851074\n",
      "[proc 0][Train](411/100000) average neg_loss: 0.31172889471054077\n",
      "[proc 0][Train](411/100000) average loss: 0.3381170928478241\n",
      "[proc 0][Train](411/100000) average regularization: 0.014588680118322372\n",
      "[proc 0][Train] 1 steps take 1.274 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.204, backward: 0.004, update: 1.064\n",
      "[proc 1][Train](410/100000) average pos_loss: 0.3336029052734375\n",
      "[proc 1][Train](410/100000) average neg_loss: 0.6344667673110962\n",
      "[proc 1][Train](410/100000) average loss: 0.48403483629226685\n",
      "[proc 1][Train](410/100000) average regularization: 0.014738043770194054\n",
      "[proc 1][Train] 1 steps take 1.294 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.074\n",
      "[proc 0][Train](412/100000) average pos_loss: 0.30986863374710083\n",
      "[proc 0][Train](412/100000) average neg_loss: 0.6744900941848755\n",
      "[proc 0][Train](412/100000) average loss: 0.49217936396598816\n",
      "[proc 0][Train](412/100000) average regularization: 0.014761787839233875\n",
      "[proc 0][Train] 1 steps take 1.276 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.004, update: 1.061\n",
      "[proc 1][Train](411/100000) average pos_loss: 0.3327993154525757\n",
      "[proc 1][Train](411/100000) average neg_loss: 0.3304869830608368\n",
      "[proc 1][Train](411/100000) average loss: 0.33164316415786743\n",
      "[proc 1][Train](411/100000) average regularization: 0.014705413021147251\n",
      "[proc 1][Train] 1 steps take 1.300 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.085\n",
      "[proc 0][Train](413/100000) average pos_loss: 0.36450326442718506\n",
      "[proc 0][Train](413/100000) average neg_loss: 0.29567277431488037\n",
      "[proc 0][Train](413/100000) average loss: 0.3300880193710327\n",
      "[proc 0][Train](413/100000) average regularization: 0.014594495296478271\n",
      "[proc 0][Train] 1 steps take 1.315 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.099\n",
      "[proc 1][Train](412/100000) average pos_loss: 0.34214434027671814\n",
      "[proc 1][Train](412/100000) average neg_loss: 0.6982873678207397\n",
      "[proc 1][Train](412/100000) average loss: 0.5202158689498901\n",
      "[proc 1][Train](412/100000) average regularization: 0.014731609262526035\n",
      "[proc 1][Train] 1 steps take 1.304 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.086\n",
      "[proc 0][Train](414/100000) average pos_loss: 0.31000465154647827\n",
      "[proc 0][Train](414/100000) average neg_loss: 0.7280346751213074\n",
      "[proc 0][Train](414/100000) average loss: 0.5190196633338928\n",
      "[proc 0][Train](414/100000) average regularization: 0.014736302196979523\n",
      "[proc 0][Train] 1 steps take 1.284 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.212, backward: 0.004, update: 1.066\n",
      "[proc 1][Train](413/100000) average pos_loss: 0.33496737480163574\n",
      "[proc 1][Train](413/100000) average neg_loss: 0.33142030239105225\n",
      "[proc 1][Train](413/100000) average loss: 0.333193838596344\n",
      "[proc 1][Train](413/100000) average regularization: 0.014742406085133553\n",
      "[proc 1][Train] 1 steps take 1.283 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.208, backward: 0.002, update: 1.071\n",
      "[proc 0][Train](415/100000) average pos_loss: 0.37173593044281006\n",
      "[proc 0][Train](415/100000) average neg_loss: 0.32061344385147095\n",
      "[proc 0][Train](415/100000) average loss: 0.3461746871471405\n",
      "[proc 0][Train](415/100000) average regularization: 0.014507328160107136\n",
      "[proc 0][Train] 1 steps take 1.257 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.198, backward: 0.004, update: 1.054\n",
      "[proc 1][Train](414/100000) average pos_loss: 0.34865736961364746\n",
      "[proc 1][Train](414/100000) average neg_loss: 0.6300801634788513\n",
      "[proc 1][Train](414/100000) average loss: 0.4893687665462494\n",
      "[proc 1][Train](414/100000) average regularization: 0.014536309987306595\n",
      "[proc 1][Train] 1 steps take 1.293 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.076\n",
      "[proc 0][Train](416/100000) average pos_loss: 0.322359561920166\n",
      "[proc 0][Train](416/100000) average neg_loss: 0.6710568070411682\n",
      "[proc 0][Train](416/100000) average loss: 0.4967081844806671\n",
      "[proc 0][Train](416/100000) average regularization: 0.014792276546359062\n",
      "[proc 0][Train] 1 steps take 1.280 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.195, backward: 0.004, update: 1.080\n",
      "[proc 1][Train](415/100000) average pos_loss: 0.3320878744125366\n",
      "[proc 1][Train](415/100000) average neg_loss: 0.3192446231842041\n",
      "[proc 1][Train](415/100000) average loss: 0.32566624879837036\n",
      "[proc 1][Train](415/100000) average regularization: 0.014680856838822365\n",
      "[proc 1][Train] 1 steps take 1.285 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.068\n",
      "[proc 0][Train](417/100000) average pos_loss: 0.3574441373348236\n",
      "[proc 0][Train](417/100000) average neg_loss: 0.3174181878566742\n",
      "[proc 0][Train](417/100000) average loss: 0.3374311625957489\n",
      "[proc 0][Train](417/100000) average regularization: 0.014607410877943039\n",
      "[proc 0][Train] 1 steps take 1.333 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.212, backward: 0.002, update: 1.103\n",
      "[proc 1][Train](416/100000) average pos_loss: 0.33030152320861816\n",
      "[proc 1][Train](416/100000) average neg_loss: 0.6420053839683533\n",
      "[proc 1][Train](416/100000) average loss: 0.4861534535884857\n",
      "[proc 1][Train](416/100000) average regularization: 0.014633215963840485\n",
      "[proc 1][Train] 1 steps take 1.286 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.067\n",
      "[proc 0][Train](418/100000) average pos_loss: 0.3170197010040283\n",
      "[proc 0][Train](418/100000) average neg_loss: 0.6925894618034363\n",
      "[proc 0][Train](418/100000) average loss: 0.5048046112060547\n",
      "[proc 0][Train](418/100000) average regularization: 0.014699566178023815\n",
      "[proc 0][Train] 1 steps take 1.359 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.211, backward: 0.004, update: 1.124\n",
      "[proc 1][Train](417/100000) average pos_loss: 0.33283287286758423\n",
      "[proc 1][Train](417/100000) average neg_loss: 0.32690903544425964\n",
      "[proc 1][Train](417/100000) average loss: 0.32987093925476074\n",
      "[proc 1][Train](417/100000) average regularization: 0.014641831628978252\n",
      "[proc 1][Train] 1 steps take 1.297 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.202, backward: 0.003, update: 1.076\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](419/100000) average pos_loss: 0.36818549036979675\n",
      "[proc 0][Train](419/100000) average neg_loss: 0.3087179958820343\n",
      "[proc 0][Train](419/100000) average loss: 0.3384517431259155\n",
      "[proc 0][Train](419/100000) average regularization: 0.014619143679738045\n",
      "[proc 0][Train] 1 steps take 1.392 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.257, backward: 0.004, update: 1.130\n",
      "[proc 1][Train](418/100000) average pos_loss: 0.3382496237754822\n",
      "[proc 1][Train](418/100000) average neg_loss: 0.6215795278549194\n",
      "[proc 1][Train](418/100000) average loss: 0.4799145758152008\n",
      "[proc 1][Train](418/100000) average regularization: 0.014599700458347797\n",
      "[proc 1][Train] 1 steps take 1.337 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.212, backward: 0.002, update: 1.108\n",
      "[proc 0][Train](420/100000) average pos_loss: 0.3105357885360718\n",
      "[proc 0][Train](420/100000) average neg_loss: 0.6984896659851074\n",
      "[proc 0][Train](420/100000) average loss: 0.5045127272605896\n",
      "[proc 0][Train](420/100000) average regularization: 0.014670128002762794\n",
      "[proc 0][Train] 1 steps take 1.285 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.209, backward: 0.004, update: 1.070\n",
      "[proc 1][Train](419/100000) average pos_loss: 0.32469767332077026\n",
      "[proc 1][Train](419/100000) average neg_loss: 0.35796329379081726\n",
      "[proc 1][Train](419/100000) average loss: 0.34133046865463257\n",
      "[proc 1][Train](419/100000) average regularization: 0.014684502966701984\n",
      "[proc 1][Train] 1 steps take 1.330 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.217, backward: 0.002, update: 1.110\n",
      "[proc 0][Train](421/100000) average pos_loss: 0.36175382137298584\n",
      "[proc 0][Train](421/100000) average neg_loss: 0.3027135729789734\n",
      "[proc 0][Train](421/100000) average loss: 0.3322336971759796\n",
      "[proc 0][Train](421/100000) average regularization: 0.014619922265410423\n",
      "[proc 0][Train] 1 steps take 1.277 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.004, update: 1.059\n",
      "[proc 1][Train](420/100000) average pos_loss: 0.3418591618537903\n",
      "[proc 1][Train](420/100000) average neg_loss: 0.6076704859733582\n",
      "[proc 1][Train](420/100000) average loss: 0.4747648239135742\n",
      "[proc 1][Train](420/100000) average regularization: 0.014687379822134972\n",
      "[proc 1][Train] 1 steps take 1.315 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.214, backward: 0.003, update: 1.096\n",
      "[proc 0][Train](422/100000) average pos_loss: 0.3237980306148529\n",
      "[proc 0][Train](422/100000) average neg_loss: 0.7465062141418457\n",
      "[proc 0][Train](422/100000) average loss: 0.5351521372795105\n",
      "[proc 0][Train](422/100000) average regularization: 0.014710702002048492\n",
      "[proc 0][Train] 1 steps take 1.282 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.212, backward: 0.004, update: 1.065\n",
      "[proc 1][Train](421/100000) average pos_loss: 0.3277166187763214\n",
      "[proc 1][Train](421/100000) average neg_loss: 0.3583308458328247\n",
      "[proc 1][Train](421/100000) average loss: 0.34302371740341187\n",
      "[proc 1][Train](421/100000) average regularization: 0.014690789394080639\n",
      "[proc 1][Train] 1 steps take 1.319 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.216, backward: 0.003, update: 1.099\n",
      "[proc 0][Train](423/100000) average pos_loss: 0.3673321306705475\n",
      "[proc 0][Train](423/100000) average neg_loss: 0.30620551109313965\n",
      "[proc 0][Train](423/100000) average loss: 0.3367688059806824\n",
      "[proc 0][Train](423/100000) average regularization: 0.014549145475029945\n",
      "[proc 0][Train] 1 steps take 1.306 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.210, backward: 0.004, update: 1.090\n",
      "[proc 1][Train](422/100000) average pos_loss: 0.34222352504730225\n",
      "[proc 1][Train](422/100000) average neg_loss: 0.6308215856552124\n",
      "[proc 1][Train](422/100000) average loss: 0.4865225553512573\n",
      "[proc 1][Train](422/100000) average regularization: 0.014591110870242119\n",
      "[proc 1][Train] 1 steps take 1.343 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.212, backward: 0.003, update: 1.127\n",
      "[proc 0][Train](424/100000) average pos_loss: 0.32088208198547363\n",
      "[proc 0][Train](424/100000) average neg_loss: 0.6465108394622803\n",
      "[proc 0][Train](424/100000) average loss: 0.48369646072387695\n",
      "[proc 0][Train](424/100000) average regularization: 0.014659726992249489\n",
      "[proc 0][Train] 1 steps take 1.328 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.213, backward: 0.004, update: 1.109\n",
      "[proc 1][Train](423/100000) average pos_loss: 0.3380753993988037\n",
      "[proc 1][Train](423/100000) average neg_loss: 0.31634482741355896\n",
      "[proc 1][Train](423/100000) average loss: 0.32721012830734253\n",
      "[proc 1][Train](423/100000) average regularization: 0.014557929709553719\n",
      "[proc 1][Train] 1 steps take 1.329 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.112\n",
      "[proc 0][Train](425/100000) average pos_loss: 0.36197906732559204\n",
      "[proc 0][Train](425/100000) average neg_loss: 0.32989501953125\n",
      "[proc 0][Train](425/100000) average loss: 0.345937043428421\n",
      "[proc 0][Train](425/100000) average regularization: 0.014531198889017105\n",
      "[proc 0][Train] 1 steps take 1.292 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.213, backward: 0.003, update: 1.074\n",
      "^C\n",
      "Process Process-1:1:\n",
      "Process Process-2:1:\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/bin/dglke_train\", line 8, in <module>\n",
      "    sys.exit(main())\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/train.py\", line 281, in main\n",
      "    proc.join()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 149, in join\n",
      "    res = self._popen.wait(timeout)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/popen_fork.py\", line 47, in wait\n",
      "    return self.poll(os.WNOHANG if timeout == 0.0 else 0)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/popen_fork.py\", line 27, in poll\n",
      "    pid, sts = os.waitpid(self.pid, flag)\n",
      "KeyboardInterrupt\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n",
      "    self.run()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 108, in run\n",
      "    self._target(*self._args, **self._kwargs)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/models/pytorch/tensor_models.py\", line 119, in decorated_function\n",
      "    result, exception, trace = queue.get()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/queues.py\", line 97, in get\n",
      "    res = self._recv_bytes()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 216, in recv_bytes\n",
      "    buf = self._recv_bytes(maxlength)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 414, in _recv_bytes\n",
      "    buf = self._recv(4)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 379, in _recv\n",
      "    chunk = read(handle, remaining)\n",
      "KeyboardInterrupt\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n",
      "    self.run()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 108, in run\n",
      "    self._target(*self._args, **self._kwargs)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/models/pytorch/tensor_models.py\", line 119, in decorated_function\n",
      "    result, exception, trace = queue.get()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/queues.py\", line 97, in get\n",
      "    res = self._recv_bytes()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 216, in recv_bytes\n",
      "    buf = self._recv_bytes(maxlength)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 414, in _recv_bytes\n",
      "    buf = self._recv(4)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 379, in _recv\n",
      "    chunk = read(handle, remaining)\n",
      "KeyboardInterrupt\n",
      "Process Process-1:\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n",
      "    self.run()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 108, in run\n",
      "    self._target(*self._args, **self._kwargs)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/models/pytorch/tensor_models.py\", line 119, in decorated_function\n",
      "    result, exception, trace = queue.get()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/queues.py\", line 97, in get\n",
      "    res = self._recv_bytes()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 216, in recv_bytes\n",
      "    buf = self._recv_bytes(maxlength)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 414, in _recv_bytes\n",
      "    buf = self._recv(4)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 379, in _recv\n",
      "    chunk = read(handle, remaining)\n",
      "KeyboardInterrupt\n"
     ]
    }
   ],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 1 --eval_interval 50000 --num_thread 32"
   ]
  },
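  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The grid listed above has 2 × 3 × 3 = 18 combinations of hidden_dim, gamma and lr, and the numbered sections write each command out by hand. As an alternative, the commands could be generated programmatically. The following is only a minimal sketch, not part of the original tutorial: the helper names `build_command` and `grid_commands` are hypothetical, every flag is copied from the cells in this notebook, and `--log_interval 20000` is assumed for all runs, as in sections 2 to 18. It only builds and prints the command strings; it does not start any training.\n",
    "\n",
    "```python\n",
    "import itertools\n",
    "import shlex\n",
    "\n",
    "# Search space copied from the grid listed in this notebook.\n",
    "HIDDEN_DIMS = [200, 400]\n",
    "GAMMAS = [6.0, 12.0, 18.0]\n",
    "LRS = [0.01, 0.05, 0.1]\n",
    "\n",
    "\n",
    "def build_command(hidden_dim, gamma, lr):\n",
    "    # Hypothetical helper: returns one dglke_train command line.\n",
    "    # All flags mirror the hand-written cells; only three values vary.\n",
    "    args = [\n",
    "        'dglke_train', '--dataset', 'DRKG', '--data_path', './dataset',\n",
    "        '--data_files', 'drkg_train.tsv', 'drkg_valid.tsv', 'drkg_test.tsv',\n",
    "        '--format', 'raw_udd_hrt', '--model_name', 'RESCAL',\n",
    "        '--batch_size', '4096', '--neg_sample_size', '256',\n",
    "        '--hidden_dim', str(hidden_dim), '--gamma', str(gamma), '--lr', str(lr),\n",
    "        '--max_step', '100000', '-adv', '--regularization_coef', '1.00E-07',\n",
    "        '--gpu', '0', '1', '--num_proc', '2', '--mix_cpu_gpu', '--async_update',\n",
    "        '--force_sync_interval', '1000', '--valid', '--test',\n",
    "        '--batch_size_eval', '128', '--neg_sample_size_eval', '10000',\n",
    "        '--log_interval', '20000', '--eval_interval', '50000', '--num_thread', '32',\n",
    "    ]\n",
    "    # Quote each argument so the string is safe to paste into a shell.\n",
    "    return 'DGLBACKEND=pytorch ' + ' '.join(shlex.quote(a) for a in args)\n",
    "\n",
    "\n",
    "def grid_commands():\n",
    "    # Hypothetical helper: enumerates the grid in the same order as the\n",
    "    # numbered sections (lr varies fastest, then gamma, then hidden_dim).\n",
    "    combos = itertools.product(HIDDEN_DIMS, GAMMAS, LRS)\n",
    "    return [build_command(h, g, lr) for h, g, lr in combos]\n",
    "\n",
    "\n",
    "commands = grid_commands()\n",
    "for i, cmd in enumerate(commands, start=1):\n",
    "    print(f'run {i}: {cmd}')\n",
    "```\n",
    "\n",
    "Each printed command could then be pasted into a cell after `!`, or passed to a shell, one run at a time."
   ]
  },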
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading train triples....\r\n"
     ]
    }
   ],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 10\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 13\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 14\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 15\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 16\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 17\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name RESCAL \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
